|
|
Created:
9 years, 3 months ago by noelallen1 Modified:
8 years, 11 months ago CC:
chromium-reviews Base URL:
svn://chrome-svn/chrome/trunk/src/ Visibility:
Public. |
DescriptionCreate scanning script to determine header dependencies.
This python script will take a set of input files and header include directories and scan through the sources trying to find all header references to generate a list of file dependencies. The scanner does not use a preprocessor so all headers in the source even if conditionally compiled out, are returned. Files which can not be found in the file system are ignored.
BUG= http://code.google.com/p/chromium/issues/detail?id=96782
TEST= manual
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=103675
Patch Set 1 #
Total comments: 32
Patch Set 2 : '' #
Total comments: 12
Patch Set 3 : '' #Patch Set 4 : '' #Patch Set 5 : '' #
Total comments: 2
Messages
Total messages: 8 (0 generated)
Might be worth writing a test. You could mock out os.path.exists and maybe add a ReadFile function or something to make it simple to mock that. This is gonna be a really handy piece of code to have around. Heart and soul of any decent build system really. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py File build/scan_sources.py (right): http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode10 build/scan_sources.py:10: """ Pull this up onto the line with the """ (per style guide). http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode17 build/scan_sources.py:17: Extra cr here. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode26 build/scan_sources.py:26: self.AddDir('.') This is adding cwd to the search order (the cwd where the script is run). I'm not sure that's quite what you'd want. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode27 build/scan_sources.py:27: self.cwd = os.path.realpath(os.getcwd()) I've noticed you used realpath throughout. I actually suspect you want to use abspath, as I doubt C compilers typically resolve symlinks before operating on paths. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode36 build/scan_sources.py:36: print 'Added dir: %s' % pathname Leftover? I assume these are something you want in the final output? http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode49 build/scan_sources.py:49: print " %s\n %s\n" % (filepath, basepath) Leftover? http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode59 build/scan_sources.py:59: if filename[0] == os.path.sep: os.path.isabs(filename) http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode71 build/scan_sources.py:71: """ Give a one line description per the style guide. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode76 build/scan_sources.py:76: regex = r'(?P<inc>\#include [<"].+[>"])' You did P<inc> but never used that group name. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode76 build/scan_sources.py:76: regex = r'(?P<inc>\#include [<"].+[>"])' .+ -> [^>"]+ http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode77 build/scan_sources.py:77: self.work_q = parent So include <> vs include "" have different behaviors. Though unfortunately different in preprocessor specific ways. Typically "" prepends the directory containing the file doing the including to the search order, whereas <> does not. In some preprocessors this extra path is propagated to child includes (msvs), but not in gcc I believe. Not sure how far we want to go on correctness, but I guess we do hope to use this all over. My guess is its worked locally due to cwd being in the include order? Dug up what scons is doing for instance: """ A Classic Scanner subclass which takes into account the type of bracketing used to include the file, and uses classic CPP rules for searching for the files based on the bracketing. Note that in order for this to work, the regular expression passed to the constructor must return the leading bracket in group 0, and the contained filename in group 1. """ def find_include(self, include, source_dir, path): if include[0] == '"': paths = (source_dir,) + tuple(path) else: paths = tuple(path) + (source_dir,) n = SCons.Node.FS.find_file(include[1], paths) return n, include[1] Appears they assume source_dir is prepended for " and appended for < just in case. Oh interesting, here's the regex scons uses: ^[ \t]*#[ \t]*(?:include|import)[ \t]*(<|")(\[^>"]+)(>|") You likely don't care about #import ObjectiveC-ism, but I think gcc supports it in general. Oh, but you do want to support the spaces/tabs between # and include and </". I don't know if we have any in our code base, but I assume you've seen indented processor style: #ifdef foo # include "a/b/c/d" #else # include "x/y/z" #endif http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode83 build/scan_sources.py:83: for token in self.parser.split(data): Split's a little weird, I'd use findall here as if you () group around the included path, you just get that out. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode94 build/scan_sources.py:94: except: Technically the style guide forbids carte-blanc excepts. except IOError: ? http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode99 build/scan_sources.py:99: """ Pull out the first sentence as a one line description per the style guide. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode119 build/scan_sources.py:119: self.added_set |= set([resolved_name]) -> self.added_set.add(resolved_name) http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode147 build/scan_sources.py:147: if len(arg) > 2 and arg[0:2] == '-I': Style guide calls for consistent use of single or double quotes throughout (except for triple quotes). Single is more common in nacl/chrome files, but not universal. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode147 build/scan_sources.py:147: if len(arg) > 2 and arg[0:2] == '-I': I'm all for avoiding optparse when possible, but I think at the point you're taking -I's its likely time to consider it. Oh by the way, as we're taking args from gyp, do we need to support multiple include dirs in a single arg? http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode160 build/scan_sources.py:160: Style guide is a little vague, but most places seem to have 2 lines between the main guard and the last definiiton in the file. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode163 build/scan_sources.py:163: Drop trailing line.
Very close, could formatting nit and some questions about /s http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py File build/scan_sources.py (right): http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode17 build/scan_sources.py:17: debug = False If you're gonna leave this in, maybe put options up here and and make this a command line option? http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode18 build/scan_sources.py:18: cr http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode21 build/scan_sources.py:21: cr http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode66 build/scan_sources.py:66: return os.sep.join(rel_parts) Maybe make this use '/' instead of os.sep, and or (proably or) add a separate filtering step to convert to /'s which is what gyp will need. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources_test.py File build/scan_sources_test.py (right): http://codereview.chromium.org/8037013/diff/5001/build/scan_sources_test.py#n... build/scan_sources_test.py:14: Add a line here. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources_test.py#n... build/scan_sources_test.py:78: self.assertEqual(rel, resolver.RealToRelative(full, base)) Will this test do the right thing on windows? In fact, actually we almost certainly want to filter the output on windows to use / instead of os.sep Sigh. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources_test.py#n... build/scan_sources_test.py:94: Add a line here. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources_test.py#n... build/scan_sources_test.py:97: Drop this line.
Added a class which translates all paths into native and back out to POSIX to force all paths seen by code to be POSIX style. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py File build/scan_sources.py (right): http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode10 build/scan_sources.py:10: """ On 2011/09/26 06:21:57, bradn wrote: > Pull this up onto the line with the """ (per style guide). Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode17 build/scan_sources.py:17: On 2011/09/26 06:21:57, bradn wrote: > Extra cr here. Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode26 build/scan_sources.py:26: self.AddDir('.') On 2011/09/26 06:21:57, bradn wrote: > This is adding cwd to the search order (the cwd where the script is run). I'm > not sure that's quite what you'd want. Removed, I force you to specify it now. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode27 build/scan_sources.py:27: self.cwd = os.path.realpath(os.getcwd()) On 2011/09/26 06:21:57, bradn wrote: > I've noticed you used realpath throughout. > I actually suspect you want to use abspath, as I doubt C compilers typically > resolve symlinks before operating on paths. I use realpath so that two links to the same location resolve to the same path. I'm not sure what benefit you are suggest from using abspath. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode36 build/scan_sources.py:36: print 'Added dir: %s' % pathname On 2011/09/26 06:21:57, bradn wrote: > Leftover? I assume these are something you want in the final output? Not sure what we should do here. We "failed" because we hit an unexpected condition. I want the user to know what the problem was. I return -1, so hopefully the make system will reject it. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode49 build/scan_sources.py:49: print " %s\n %s\n" % (filepath, basepath) On 2011/09/26 06:21:57, bradn wrote: > Leftover? Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode59 build/scan_sources.py:59: if filename[0] == os.path.sep: On 2011/09/26 06:21:57, bradn wrote: > os.path.isabs(filename) Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode77 build/scan_sources.py:77: self.work_q = parent On 2011/09/26 06:21:57, bradn wrote: > So include <> vs include "" have different behaviors. > Though unfortunately different in preprocessor specific ways. > Typically "" prepends the directory containing the file doing the including to > the search order, whereas <> does not. In some preprocessors this extra path is > propagated to child includes (msvs), but not in gcc I believe. > > Not sure how far we want to go on correctness, but I guess we do hope to use > this all over. My guess is its worked locally due to cwd being in the include > order? > > Dug up what scons is doing for instance: > > > """ > A Classic Scanner subclass which takes into account the type of > bracketing used to include the file, and uses classic CPP rules > for searching for the files based on the bracketing. > > Note that in order for this to work, the regular expression passed > to the constructor must return the leading bracket in group 0, and > the contained filename in group 1. > """ > def find_include(self, include, source_dir, path): > if include[0] == '"': > paths = (source_dir,) + tuple(path) > else: > paths = tuple(path) + (source_dir,) > > n = SCons.Node.FS.find_file(include[1], paths) > > return n, include[1] > > Appears they assume source_dir is prepended for " and appended for < just in > case. > > > Oh interesting, here's the regex scons uses: > > ^[ \t]*#[ \t]*(?:include|import)[ \t]*(<|")(\[^>"]+)(>|") > > You likely don't care about #import ObjectiveC-ism, but I think gcc supports it > in general. > Oh, but you do want to support the spaces/tabs between # and include and </". > > I don't know if we have any in our code base, but I assume you've seen indented > processor style: > > #ifdef foo > # include "a/b/c/d" > #else > # include "x/y/z" > #endif Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode119 build/scan_sources.py:119: self.added_set |= set([resolved_name]) On 2011/09/26 06:21:57, bradn wrote: > -> self.added_set.add(resolved_name) Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode147 build/scan_sources.py:147: if len(arg) > 2 and arg[0:2] == '-I': On 2011/09/26 06:21:57, bradn wrote: > Style guide calls for consistent use of single or double quotes throughout > (except for triple quotes). Single is more common in nacl/chrome files, but not > universal. Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode147 build/scan_sources.py:147: if len(arg) > 2 and arg[0:2] == '-I': On 2011/09/26 06:21:57, bradn wrote: > Style guide calls for consistent use of single or double quotes throughout > (except for triple quotes). Single is more common in nacl/chrome files, but not > universal. Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode160 build/scan_sources.py:160: On 2011/09/26 06:21:57, bradn wrote: > Style guide is a little vague, but most places seem to have 2 lines between the > main guard and the last definiiton in the file. Done. http://codereview.chromium.org/8037013/diff/1/build/scan_sources.py#newcode163 build/scan_sources.py:163: On 2011/09/26 06:21:57, bradn wrote: > Drop trailing line. Done. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py File build/scan_sources.py (right): http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode17 build/scan_sources.py:17: debug = False On 2011/09/29 20:25:30, bradn wrote: > If you're gonna leave this in, maybe put options up here and and make this a > command line option? Done. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode18 build/scan_sources.py:18: On 2011/09/29 20:25:30, bradn wrote: > cr Done. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode21 build/scan_sources.py:21: On 2011/09/29 20:25:30, bradn wrote: > cr Done. http://codereview.chromium.org/8037013/diff/5001/build/scan_sources.py#newcode66 build/scan_sources.py:66: return os.sep.join(rel_parts) On 2011/09/29 20:25:30, bradn wrote: > Maybe make this use '/' instead of os.sep, and or (proably or) add a separate > filtering step to convert to /'s which is what gyp will need. Done.
lgtm
I just stumbled upon this script while debugging some problem. Can whatever problem this script solves be solved without doing what this script does? Gyp doesn't support "'sources': ['folder/*.cc']" so that it doesn't need to hit the disk many times when running – this script kind of defeats that. http://codereview.chromium.org/8037013/diff/8003/build/scan_sources.py File build/scan_sources.py (right): http://codereview.chromium.org/8037013/diff/8003/build/scan_sources.py#newcode26 build/scan_sources.py:26: class PathConverter(object): (huh, why do you need this class?) http://codereview.chromium.org/8037013/diff/8003/build/scan_sources.py#newcode40 build/scan_sources.py:40: return os.path.exists(ospath) you don't check for isdir(ospath) here. Someone added a "utility" directory, and now this script thinks that #include <utility> in some other cc file is something that should be returned, leading to an output of both [utility, utility/some_file.h] This confuses gyp, which now thinks that utility is both a file and a directory. |