Chromium Code Reviews

Side by Side Diff: tools/clang/scripts/run_tool.py

Issue 2599193002: Split run_tool.py into run_tool.py, extract_edits.py and apply_edits.py (Closed)
Patch Set: Addressed remaining nits. Created 3 years, 11 months ago
OLD | NEW
1 #!/usr/bin/env python 1 #!/usr/bin/env python
2 # Copyright (c) 2013 The Chromium Authors. All rights reserved. 2 # Copyright (c) 2013 The Chromium Authors. All rights reserved.
3 # Use of this source code is governed by a BSD-style license that can be 3 # Use of this source code is governed by a BSD-style license that can be
4 # found in the LICENSE file. 4 # found in the LICENSE file.
5 """Wrapper script to help run clang tools across Chromium code. 5 """Wrapper script to help run clang tools across Chromium code.
6 6
7 How to use this tool: 7 How to use run_tool.py:
8 If you want to run the tool across all Chromium code: 8 If you want to run a clang tool across all Chromium code:
9 run_tool.py <tool> <path/to/compiledb> 9 run_tool.py <tool> <path/to/compiledb>
10 10
11 If you want to include all files mentioned in the compilation database: 11 If you want to include all files mentioned in the compilation database
12 (this will also include generated files, unlike the previous command):
12 run_tool.py <tool> <path/to/compiledb> --all 13 run_tool.py <tool> <path/to/compiledb> --all
13 14
14 If you only want to run the tool across just chrome/browser and content/browser: 15 If you want to run the clang tool across only chrome/browser and
16 content/browser:
15 run_tool.py <tool> <path/to/compiledb> chrome/browser content/browser 17 run_tool.py <tool> <path/to/compiledb> chrome/browser content/browser
16 18
17 Please see https://chromium.googlesource.com/chromium/src/+/master/docs/clang_tool_refactoring.md for more 19 Please see docs/clang_tool_refactoring.md for more information, which documents
18 information, which documents the entire automated refactoring flow in Chromium. 20 the entire automated refactoring flow in Chromium.
19 21
20 Why use this tool: 22 Why use run_tool.py (instead of running a clang tool directly):
21 The clang tool implementation doesn't take advantage of multiple cores, and if 23 The clang tool implementation doesn't take advantage of multiple cores, and if
22 it fails mysteriously in the middle, all the generated replacements will be 24 it fails mysteriously in the middle, all the generated replacements will be
23 lost. 25 lost. Additionally, if the work is simply sharded across multiple cores by
26 running multiple RefactoringTools, problems arise when they attempt to rewrite a
27 file at the same time.
24 28
25 Unfortunately, if the work is simply sharded across multiple cores by running 29 run_tool.py will
26 multiple RefactoringTools, problems arise when they attempt to rewrite a file at 30 1) run multiple instances of clang tool in parallel
27 the same time. To work around that, clang tools that are run using this tool 31 2) gather stdout from clang tool invocations
28 should output edits to stdout in the following format: 32 3) "atomically" forward #2 to stdout
29 33
30 ==== BEGIN EDITS ==== 34 Output of run_tool.py can be piped into extract_edits.py and then into
31 r:<file path>:<offset>:<length>:<replacement text> 35 apply_edits.py. These tools will extract individual edits and apply them to the
32 r:<file path>:<offset>:<length>:<replacement text> 36 source files. These tools assume the clang tool emits the edits in the
33 ...etc... 37 following format:
34 ==== END EDITS ==== 38 ...
39 ==== BEGIN EDITS ====
40 r:::<file path>:::<offset>:::<length>:::<replacement text>
41 r:::<file path>:::<offset>:::<length>:::<replacement text>
42 ...etc...
43 ==== END EDITS ====
44 ...
35 45
36 Any generated edits are applied once the clang tool has finished running 46 extract_edits.py extracts only lines between BEGIN/END EDITS markers
37 across Chromium, regardless of whether some instances failed or not. 47 apply_edits.py reads edit lines from stdin and applies the edits
38 """ 48 """
39 49
40 import argparse 50 import argparse
41 import collections
42 import functools 51 import functools
43 import multiprocessing 52 import multiprocessing
44 import os 53 import os
45 import os.path 54 import os.path
46 import subprocess 55 import subprocess
47 import sys 56 import sys
48 57
49 script_dir = os.path.dirname(os.path.realpath(__file__)) 58 script_dir = os.path.dirname(os.path.realpath(__file__))
50 tool_dir = os.path.abspath(os.path.join(script_dir, '../pylib')) 59 tool_dir = os.path.abspath(os.path.join(script_dir, '../pylib'))
51 sys.path.insert(0, tool_dir) 60 sys.path.insert(0, tool_dir)
52 61
53 from clang import compile_db 62 from clang import compile_db
54 63
55 Edit = collections.namedtuple('Edit',
56 ('edit_type', 'offset', 'length', 'replacement'))
57
58 64
59 def _GetFilesFromGit(paths=None): 65 def _GetFilesFromGit(paths=None):
60 """Gets the list of files in the git repository. 66 """Gets the list of files in the git repository.
61 67
62 Args: 68 Args:
63 paths: Prefix filter for the returned paths. May contain multiple entries. 69 paths: Prefix filter for the returned paths. May contain multiple entries.
64 """ 70 """
65 args = [] 71 args = []
66 if sys.platform == 'win32': 72 if sys.platform == 'win32':
67 args.append('git.bat') 73 args.append('git.bat')
(...skipping 10 matching lines...)
78 def _GetFilesFromCompileDB(build_directory): 84 def _GetFilesFromCompileDB(build_directory):
79 """ Gets the list of files mentioned in the compilation database. 85 """ Gets the list of files mentioned in the compilation database.
80 86
81 Args: 87 Args:
82 build_directory: Directory that contains the compile database. 88 build_directory: Directory that contains the compile database.
83 """ 89 """
84 return [os.path.join(entry['directory'], entry['file']) 90 return [os.path.join(entry['directory'], entry['file'])
85 for entry in compile_db.Read(build_directory)] 91 for entry in compile_db.Read(build_directory)]
86 92
87 93
88 def _ExtractEditsFromStdout(build_directory, stdout):
89 """Extracts generated list of edits from the tool's stdout.
90
91 The expected format is documented at the top of this file.
92
93 Args:
94 build_directory: Directory that contains the compile database. Used to
95 normalize the filenames.
96 stdout: The stdout from running the clang tool.
97
98 Returns:
99 A dictionary mapping filenames to the associated edits.
100 """
101 lines = stdout.splitlines()
102 start_index = lines.index('==== BEGIN EDITS ====')
103 end_index = lines.index('==== END EDITS ====')
104 edits = collections.defaultdict(list)
105 for line in lines[start_index + 1:end_index]:
106 try:
107 edit_type, path, offset, length, replacement = line.split(':::', 4)
108 replacement = replacement.replace('\0', '\n')
109 # Normalize the file path emitted by the clang tool.
110 path = os.path.realpath(os.path.join(build_directory, path))
111 edits[path].append(Edit(edit_type, int(offset), int(length), replacement))
112 except ValueError:
113 print 'Unable to parse edit: %s' % line
114 return edits
115
116
117 def _ExecuteTool(toolname, tool_args, build_directory, filename): 94 def _ExecuteTool(toolname, tool_args, build_directory, filename):
118 """Executes the tool. 95 """Executes the clang tool.
119 96
120 This is defined outside the class so it can be pickled for the multiprocessing 97 This is defined outside the class so it can be pickled for the multiprocessing
121 module. 98 module.
122 99
123 Args: 100 Args:
124 toolname: Path to the tool to execute. 101 toolname: Name of the clang tool to execute.
125 tool_args: Arguments to be passed to the tool. Can be None. 102 tool_args: Arguments to be passed to the clang tool. Can be None.
126 build_directory: Directory that contains the compile database. 103 build_directory: Directory that contains the compile database.
127 filename: The file to run the tool over. 104 filename: The file to run the clang tool over.
128 105
129 Returns: 106 Returns:
130 A dictionary that must contain the key "status" and a boolean value 107 A dictionary that must contain the key "status" and a boolean value
131 associated with it. 108 associated with it.
132 109
133 If status is True, then the generated edits are stored with the key "edits" 110 If status is True, then the generated output is stored with the key
134 in the dictionary. 111 "stdout_text" in the dictionary.
135 112
136 Otherwise, the filename and the output from stderr are associated with the 113 Otherwise, the filename and the output from stderr are associated with the
137 keys "filename" and "stderr" respectively. 114 keys "filename" and "stderr_text" respectively.
138 """ 115 """
139 args = [toolname, '-p', build_directory, filename] 116 args = [toolname, '-p', build_directory, filename]
140 if (tool_args): 117 if (tool_args):
141 args.extend(tool_args) 118 args.extend(tool_args)
142 command = subprocess.Popen( 119 command = subprocess.Popen(
143 args, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 120 args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
144 stdout, stderr = command.communicate() 121 stdout_text, stderr_text = command.communicate()
145 if command.returncode != 0: 122 if command.returncode != 0:
146 return {'status': False, 'filename': filename, 'stderr': stderr} 123 return {'status': False, 'filename': filename, 'stderr_text': stderr_text}
147 else: 124 else:
148 return {'status': True, 125 return {'status': True, 'filename': filename, 'stdout_text': stdout_text}
149 'edits': _ExtractEditsFromStdout(build_directory, stdout)}
150 126
151 127
152 class _CompilerDispatcher(object): 128 class _CompilerDispatcher(object):
153 """Multiprocessing controller for running clang tools in parallel.""" 129 """Multiprocessing controller for running clang tools in parallel."""
154 130
155 def __init__(self, toolname, tool_args, build_directory, filenames): 131 def __init__(self, toolname, tool_args, build_directory, filenames):
156 """Initializer method. 132 """Initializer method.
157 133
158 Args: 134 Args:
159 toolname: Path to the tool to execute. 135 toolname: Path to the tool to execute.
160 tool_args: Arguments to be passed to the tool. Can be None. 136 tool_args: Arguments to be passed to the tool. Can be None.
161 build_directory: Directory that contains the compile database. 137 build_directory: Directory that contains the compile database.
162 filenames: The files to run the tool over. 138 filenames: The files to run the tool over.
163 """ 139 """
164 self.__toolname = toolname 140 self.__toolname = toolname
165 self.__tool_args = tool_args 141 self.__tool_args = tool_args
166 self.__build_directory = build_directory 142 self.__build_directory = build_directory
167 self.__filenames = filenames 143 self.__filenames = filenames
168 self.__success_count = 0 144 self.__success_count = 0
169 self.__failed_count = 0 145 self.__failed_count = 0
170 self.__edit_count = 0
171 self.__edits = collections.defaultdict(list)
172
173 @property
174 def edits(self):
175 return self.__edits
176 146
177 @property 147 @property
178 def failed_count(self): 148 def failed_count(self):
179 return self.__failed_count 149 return self.__failed_count
180 150
181 def Run(self): 151 def Run(self):
182 """Does the grunt work.""" 152 """Does the grunt work."""
183 pool = multiprocessing.Pool() 153 pool = multiprocessing.Pool()
184 result_iterator = pool.imap_unordered( 154 result_iterator = pool.imap_unordered(
185 functools.partial(_ExecuteTool, self.__toolname, self.__tool_args, 155 functools.partial(_ExecuteTool, self.__toolname, self.__tool_args,
186 self.__build_directory), 156 self.__build_directory),
187 self.__filenames) 157 self.__filenames)
188 for result in result_iterator: 158 for result in result_iterator:
189 self.__ProcessResult(result) 159 self.__ProcessResult(result)
190 sys.stdout.write('\n') 160 sys.stderr.write('\n')
191 sys.stdout.flush()
192 161
193 def __ProcessResult(self, result): 162 def __ProcessResult(self, result):
194 """Handles result processing. 163 """Handles result processing.
195 164
196 Args: 165 Args:
197 result: The result dictionary returned by _ExecuteTool. 166 result: The result dictionary returned by _ExecuteTool.
198 """ 167 """
199 if result['status']: 168 if result['status']:
200 self.__success_count += 1 169 self.__success_count += 1
201 for k, v in result['edits'].iteritems(): 170 sys.stdout.write(result['stdout_text'])
202 self.__edits[k].extend(v)
203 self.__edit_count += len(v)
204 else: 171 else:
205 self.__failed_count += 1 172 self.__failed_count += 1
206 sys.stdout.write('\nFailed to process %s\n' % result['filename']) 173 sys.stderr.write('\nFailed to process %s\n' % result['filename'])
207 sys.stdout.write(result['stderr']) 174 sys.stderr.write(result['stderr_text'])
208 sys.stdout.write('\n') 175 sys.stderr.write('\n')
209 percentage = (float(self.__success_count + self.__failed_count) / 176 done_count = self.__success_count + self.__failed_count
210 len(self.__filenames)) * 100 177 percentage = (float(done_count) / len(self.__filenames)) * 100
211 sys.stdout.write('Succeeded: %d, Failed: %d, Edits: %d [%.2f%%]\r' % 178 sys.stderr.write(
212 (self.__success_count, self.__failed_count, 179 'Processed %d files with %s tool (%d failures) [%.2f%%]\r' %
213 self.__edit_count, percentage)) 180 (done_count, self.__toolname, self.__failed_count, percentage))
214 sys.stdout.flush()
215
216
217 def _ApplyEdits(edits):
218 """Apply the generated edits.
219
220 Args:
221 edits: A dict mapping filenames to Edit instances that apply to that file.
222 """
223 edit_count = 0
224 for k, v in edits.iteritems():
225 # Sort the edits and iterate through them in reverse order. Sorting allows
226 # duplicate edits to be quickly skipped, while reversing means that
227 # subsequent edits don't need to have their offsets updated with each edit
228 # applied.
229 v.sort()
230 last_edit = None
231 with open(k, 'rb+') as f:
232 contents = bytearray(f.read())
233 for edit in reversed(v):
234 if edit == last_edit:
235 continue
236 last_edit = edit
237 contents[edit.offset:edit.offset + edit.length] = edit.replacement
238 if not edit.replacement:
239 _ExtendDeletionIfElementIsInList(contents, edit.offset)
240 edit_count += 1
241 f.seek(0)
242 f.truncate()
243 f.write(contents)
244 print 'Applied %d edits to %d files' % (edit_count, len(edits))
245
246
247 _WHITESPACE_BYTES = frozenset((ord('\t'), ord('\n'), ord('\r'), ord(' ')))
248
249
250 def _ExtendDeletionIfElementIsInList(contents, offset):
251 """Extends the range of a deletion if the deleted element was part of a list.
252
253 This rewriter helper makes it easy for refactoring tools to remove elements
254 from a list. Even if a matcher callback knows that it is removing an element
255 from a list, it may not have enough information to accurately remove the list
256 element; for example, another matcher callback may end up removing an adjacent
257 list element, or all the list elements may end up being removed.
258
259 With this helper, refactoring tools can simply remove the list element and not
260 worry about having to include the comma in the replacement.
261
262 Args:
263 contents: A bytearray with the deletion already applied.
264 offset: The offset in the bytearray where the deleted range used to be.
265 """
266 char_before = char_after = None
267 left_trim_count = 0
268 for byte in reversed(contents[:offset]):
269 left_trim_count += 1
270 if byte in _WHITESPACE_BYTES:
271 continue
272 if byte in (ord(','), ord(':'), ord('('), ord('{')):
273 char_before = chr(byte)
274 break
275
276 right_trim_count = 0
277 for byte in contents[offset:]:
278 right_trim_count += 1
279 if byte in _WHITESPACE_BYTES:
280 continue
281 if byte == ord(','):
282 char_after = chr(byte)
283 break
284
285 if char_before:
286 if char_after:
287 del contents[offset:offset + right_trim_count]
288 elif char_before in (',', ':'):
289 del contents[offset - left_trim_count:offset]
290 181
291 182
292 def main(): 183 def main():
293 parser = argparse.ArgumentParser() 184 parser = argparse.ArgumentParser()
294 parser.add_argument('tool', help='clang tool to run') 185 parser.add_argument('tool', help='clang tool to run')
295 parser.add_argument('--all', action='store_true') 186 parser.add_argument('--all', action='store_true')
296 parser.add_argument( 187 parser.add_argument(
297 '--generate-compdb', 188 '--generate-compdb',
298 action='store_true', 189 action='store_true',
299 help='regenerate the compile database before running the tool') 190 help='regenerate the compile database before running the tool')
(...skipping 12 matching lines...)
312 os.environ['PATH'] = '%s%s%s' % ( 203 os.environ['PATH'] = '%s%s%s' % (
313 os.path.abspath(os.path.join( 204 os.path.abspath(os.path.join(
314 os.path.dirname(__file__), 205 os.path.dirname(__file__),
315 '../../../third_party/llvm-build/Release+Asserts/bin')), 206 '../../../third_party/llvm-build/Release+Asserts/bin')),
316 os.pathsep, 207 os.pathsep,
317 os.environ['PATH']) 208 os.environ['PATH'])
318 209
319 if args.generate_compdb: 210 if args.generate_compdb:
320 compile_db.GenerateWithNinja(args.compile_database) 211 compile_db.GenerateWithNinja(args.compile_database)
321 212
322 filenames = set(_GetFilesFromGit(args.path_filter))
323 if args.all: 213 if args.all:
324 source_filenames = set(_GetFilesFromCompileDB(args.compile_database)) 214 source_filenames = set(_GetFilesFromCompileDB(args.compile_database))
325 else: 215 else:
216 git_filenames = set(_GetFilesFromGit(args.path_filter))
326 # Filter out files that aren't C/C++/Obj-C/Obj-C++. 217 # Filter out files that aren't C/C++/Obj-C/Obj-C++.
327 extensions = frozenset(('.c', '.cc', '.cpp', '.m', '.mm')) 218 extensions = frozenset(('.c', '.cc', '.cpp', '.m', '.mm'))
328 source_filenames = [f 219 source_filenames = [f
329 for f in filenames 220 for f in git_filenames
330 if os.path.splitext(f)[1] in extensions] 221 if os.path.splitext(f)[1] in extensions]
222
331 dispatcher = _CompilerDispatcher(args.tool, args.tool_args, 223 dispatcher = _CompilerDispatcher(args.tool, args.tool_args,
332 args.compile_database, 224 args.compile_database,
333 source_filenames) 225 source_filenames)
334 dispatcher.Run() 226 dispatcher.Run()
335 # Filter out edits to files that aren't in the git repository, since it's not
336 # useful to modify files that aren't under source control--typically, these
337 # are generated files or files in a git submodule that's not part of Chromium.
338 _ApplyEdits({k: v
339 for k, v in dispatcher.edits.iteritems()
340 if os.path.realpath(k) in filenames})
341 return -dispatcher.failed_count 227 return -dispatcher.failed_count
342 228
343 229
344 if __name__ == '__main__': 230 if __name__ == '__main__':
345 sys.exit(main()) 231 sys.exit(main())
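
The new docstring describes a pipeline in which run_tool.py forwards the clang tool's stdout, extract_edits.py keeps only the lines between the BEGIN/END EDITS markers, and apply_edits.py applies them. Below is a minimal sketch of those two downstream steps, assuming the `r:::<file path>:::<offset>:::<length>:::<replacement text>` format shown above. It is written for Python 3 and is not the code from this patch set: the function names `extract_edit_lines`, `parse_edit`, and `apply_edits` are hypothetical, and the sketch works on an in-memory buffer rather than on files. It borrows two details from the removed `_ExtractEditsFromStdout` and `_ApplyEdits`: a NUL byte in the replacement text stands for a newline, and edits are applied from the highest offset down so that earlier offsets stay valid.

```python
import collections

# Hypothetical record type; the removed code used a similar namedtuple
# without the 'path' field.
Edit = collections.namedtuple(
    'Edit', ('edit_type', 'path', 'offset', 'length', 'replacement'))

BEGIN_MARKER = '==== BEGIN EDITS ===='
END_MARKER = '==== END EDITS ===='


def extract_edit_lines(text):
    """Yields only the lines found between BEGIN/END EDITS markers."""
    inside = False
    for line in text.splitlines():
        if line == BEGIN_MARKER:
            inside = True
        elif line == END_MARKER:
            inside = False
        elif inside:
            yield line


def parse_edit(line):
    """Parses one 'r:::<path>:::<offset>:::<length>:::<replacement>' line."""
    # maxsplit=4 keeps any ':::' inside the replacement text intact.
    edit_type, path, offset, length, replacement = line.split(':::', 4)
    # As in the removed code, a NUL byte encodes a newline.
    return Edit(edit_type, path, int(offset), int(length),
                replacement.replace('\0', '\n'))


def apply_edits(contents, edits):
    """Applies edits for a single file to a bytearray, in place.

    Duplicates are dropped, and edits are applied from the highest offset
    down so that earlier offsets do not shift as the buffer changes size.
    """
    for edit in sorted(set(edits), key=lambda e: e.offset, reverse=True):
        contents[edit.offset:edit.offset + edit.length] = (
            edit.replacement.encode('utf-8'))
    return contents


# Usage: tool output for one file, with unrelated output around the markers.
sample = '\n'.join([
    'some diagnostic noise',
    BEGIN_MARKER,
    'r:::a.cc:::0:::3:::foo',
    'r:::a.cc:::8:::3:::quux',
    'r:::a.cc:::0:::3:::foo',  # Duplicate edit, applied only once.
    END_MARKER,
    'more noise',
])
edits = [parse_edit(line) for line in extract_edit_lines(sample)]
result = apply_edits(bytearray(b'bar = a(baz);'), edits)
```

The sketch assumes all edits passed to `apply_edits` target the same file and do not overlap; a real implementation would group edits by path first, as the removed `_ExtractEditsFromStdout` did with a `defaultdict`.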