tools/clang/scripts/apply_edits.py - Issue 2599193002: Split run_tool.py into run_tool.py, extract_edits.py and apply_edits.py

Side by Side Diff: tools/clang/scripts/apply_edits.py

Issue 2599193002: Split run_tool.py into run_tool.py, extract_edits.py and apply_edits.py (Closed)

Patch Set: Addressed CR feedback from dcheng@ and danakj@. Created 3 years, 11 months ago

OLD	NEW
1 #!/usr/bin/env python	1 #!/usr/bin/env python

2 # Copyright (c) 2013 The Chromium Authors. All rights reserved.	2 # Copyright (c) 2013 The Chromium Authors. All rights reserved.

3 # Use of this source code is governed by a BSD-style license that can be	3 # Use of this source code is governed by a BSD-style license that can be

4 # found in the LICENSE file.	4 # found in the LICENSE file.

5 """Wrapper script to help run clang tools across Chromium code.	5 """Applies edits generated by a clang tool that was run on Chromium code.

6	6

7 How to use this tool:	7 Synopsis:

8 If you want to run the tool across all Chromium code:

9 run_tool.py <tool> <path/to/compiledb>

10	8

11 If you want to include all files mentioned in the compilation database:	9 cat run_tool.out | extract_edits.py | apply_edits.py <build dir> <filters...>

12 run_tool.py <tool> <path/to/compiledb> --all

13	10

14 If you only want to run the tool across just chrome/browser and content/browser:	11 For example - to apply edits only to WTF sources:

15 run_tool.py <tool> <path/to/compiledb> chrome/browser content/browser

16	12

17 Please see https://chromium.googlesource.com/chromium/src/+/master/docs/clang_tool_refactoring.md for more	13 ... | apply_edits.py out/gn third_party/WebKit/Source/wtf

18 information, which documents the entire automated refactoring flow in Chromium.

19	14

20 Why use this tool:	15 In addition to filters specified on the command line, the tool also skips edits

21 The clang tool implementation doesn't take advantage of multiple cores, and if	16 that apply to files that are not covered by git.

22 it fails mysteriously in the middle, all the generated replacements will be

23 lost.

24

25 Unfortunately, if the work is simply sharded across multiple cores by running

26 multiple RefactoringTools, problems arise when they attempt to rewrite a file at

27 the same time. To work around that, clang tools that are run using this tool

28 should output edits to stdout in the following format:

29

30 ==== BEGIN EDITS ====

31 r:<file path>:<offset>:<length>:<replacement text>

32 r:<file path>:<offset>:<length>:<replacement text>

33 ...etc...

34 ==== END EDITS ====

35

36 Any generated edits are applied once the clang tool has finished running

37 across Chromium, regardless of whether some instances failed or not.

38 """	17 """

39	18

40 import argparse	19 import argparse

41 import collections	20 import collections

42 import functools	21 import functools

43 import multiprocessing	22 import multiprocessing

44 import os	23 import os

45 import os.path	24 import os.path

46 import subprocess	25 import subprocess

47 import sys	26 import sys

(...skipping 20 matching lines...)
68 else:	47 else:

69 args.append('git')	48 args.append('git')

70 args.append('ls-files')	49 args.append('ls-files')

71 if paths:	50 if paths:

72 args.extend(paths)	51 args.extend(paths)

73 command = subprocess.Popen(args, stdout=subprocess.PIPE)	52 command = subprocess.Popen(args, stdout=subprocess.PIPE)

74 output, _ = command.communicate()	53 output, _ = command.communicate()

75 return [os.path.realpath(p) for p in output.splitlines()]	54 return [os.path.realpath(p) for p in output.splitlines()]

76	55

77	56

78 def _GetFilesFromCompileDB(build_directory):	57 def _ParseEditsFromStdin(build_directory):

79 """ Gets the list of files mentioned in the compilation database.

80

81 Args:

82 build_directory: Directory that contains the compile database.

83 """

84 return [os.path.join(entry['directory'], entry['file'])

85 for entry in compile_db.Read(build_directory)]

86

87

88 def _ExtractEditsFromStdout(build_directory, stdout):

89 """Extracts generated list of edits from the tool's stdout.	58 """Extracts generated list of edits from the tool's stdout.

90	59

91 The expected format is documented at the top of this file.	60 The expected format is documented at the top of this file.

92	61

93 Args:	62 Args:

94 build_directory: Directory that contains the compile database. Used to	63 build_directory: Directory that contains the compile database. Used to

95 normalize the filenames.	64 normalize the filenames.

96 stdout: The stdout from running the clang tool.	65 stdout: The stdout from running the clang tool.

97	66

98 Returns:	67 Returns:

99 A dictionary mapping filenames to the associated edits.	68 A dictionary mapping filenames to the associated edits.

100 """	69 """

101 lines = stdout.splitlines()	70 path_to_resolved_path = dict()
	dcheng 2016/12/28 19:01:37: Nit: = {} for consistency
	Łukasz Anforowicz 2016/12/28 19:33:07: Done.
102 start_index = lines.index('==== BEGIN EDITS ====')	71 def _ResolvePath(path):

103 end_index = lines.index('==== END EDITS ====')	72 if path in path_to_resolved_path:

	73 return path_to_resolved_path[path]

	74

	75 if not os.path.isfile(path):

	76 resolved_path = os.path.realpath(os.path.join(build_directory, path))

	77 else:

	78 resolved_path = path

	79

	80 if not os.path.isfile(resolved_path):

	81 sys.stderr.write('Edit applies to a non-existant file: %s\n' % path)
	dcheng 2016/12/28 19:01:37: Nit: existent
	Łukasz Anforowicz 2016/12/28 19:33:07: Done.
	82 resolved_path = None

	83

	84 path_to_resolved_path[path] = resolved_path

	85 return resolved_path

	86

104 edits = collections.defaultdict(list)	87 edits = collections.defaultdict(list)

105 for line in lines[start_index + 1:end_index]:	88 for line in sys.stdin:

	89 line = line.rstrip("\n\r")

106 try:	90 try:

107 edit_type, path, offset, length, replacement = line.split(':::', 4)	91 edit_type, path, offset, length, replacement = line.split(':::', 4)

108 replacement = replacement.replace('\0', '\n')	92 replacement = replacement.replace('\0', '\n')

109 # Normalize the file path emitted by the clang tool.	93 path = _ResolvePath(path)

110 path = os.path.realpath(os.path.join(build_directory, path))	94 if not path: continue

111 edits[path].append(Edit(edit_type, int(offset), int(length), replacement))	95 edits[path].append(Edit(edit_type, int(offset), int(length), replacement))

112 except ValueError:	96 except ValueError:

113 print 'Unable to parse edit: %s' % line	97 sys.stderr.write('Unable to parse edit: %s\n' % line)

114 return edits	98 return edits
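
Note on the line format parsed above: a minimal standalone sketch of how one edit line is split, assuming Python 3 and a hypothetical `Edit` namedtuple (the real `Edit` definition sits in the elided part of this diff; only its field names are visible here). `parse_edit_line` is an illustrative helper, not a function in the patch.

```python
import collections

# Hypothetical stand-in for the Edit type; its definition is not shown in this diff.
Edit = collections.namedtuple(
    'Edit', ('edit_type', 'offset', 'length', 'replacement'))


def parse_edit_line(line):
  """Returns (path, Edit) for a well-formed line, or None otherwise."""
  line = line.rstrip('\n\r')
  try:
    # maxsplit=4 keeps any ':::' inside the replacement text intact.
    edit_type, path, offset, length, replacement = line.split(':::', 4)
    # NUL bytes stand in for newlines within the replacement text.
    replacement = replacement.replace('\0', '\n')
    return path, Edit(edit_type, int(offset), int(length), replacement)
  except ValueError:
    # Raised for too few fields or for a non-integer offset/length.
    return None


parsed = parse_edit_line('r:::base/foo.cc:::10:::3:::bar\0baz\n')
```

Unlike the patch, this sketch does not resolve the path against the build directory or report malformed lines on stderr.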

115	99

116	100

117 def _ExecuteTool(toolname, tool_args, build_directory, filename):	101 def _ApplyEditsToSingleFile(filename, edits):

118 """Executes the tool.	102 # Sort the edits and iterate through them in reverse order. Sorting allows

	103 # duplicate edits to be quickly skipped, while reversing means that

	104 # subsequent edits don't need to have their offsets updated with each edit

	105 # applied.

	106 edit_count = 0

	107 error_count = 0

	108 edits.sort()

	109 last_edit = None

	110 with open(filename, 'rb+') as f:

	111 contents = bytearray(f.read())

	112 for edit in reversed(edits):

	113 if edit == last_edit:

	114 continue

	115 if (last_edit is not None and edit.edit_type == last_edit.edit_type and

	116 edit.offset == last_edit.offset and edit.length == last_edit.length):

	117 sys.stderr.write(

	118 'Conflicting edit: %s at offset %d, length %d: "%s" != "%s"\n' %

	119 (filename, edit.offset, edit.length, edit.replacement,

	120 last_edit.replacement))

	121 error_count += 1

	122 continue

119	123

120 This is defined outside the class so it can be pickled for the multiprocessing	124 last_edit = edit

121 module.	125 contents[edit.offset:edit.offset + edit.length] = edit.replacement

122	126 if not edit.replacement:

123 Args:	127 _ExtendDeletionIfElementIsInList(contents, edit.offset)

124 toolname: Path to the tool to execute.	128 edit_count += 1

125 tool_args: Arguments to be passed to the tool. Can be None.	129 f.seek(0)

126 build_directory: Directory that contains the compile database.	130 f.truncate()

127 filename: The file to run the tool over.	131 f.write(contents)

128	132 return (edit_count, error_count)

129 Returns:

130 A dictionary that must contain the key "status" and a boolean value

131 associated with it.

132

133 If status is True, then the generated edits are stored with the key "edits"

134 in the dictionary.

135

136 Otherwise, the filename and the output from stderr are associated with the

137 keys "filename" and "stderr" respectively.

138 """

139 args = [toolname, '-p', build_directory, filename]

140 if (tool_args):

141 args.extend(tool_args)

142 command = subprocess.Popen(

143 args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

144 stdout, stderr = command.communicate()

145 if command.returncode != 0:

146 return {'status': False, 'filename': filename, 'stderr': stderr}

147 else:

148 return {'status': True,

149 'edits': _ExtractEditsFromStdout(build_directory, stdout)}

150

151

152 class _CompilerDispatcher(object):

153 """Multiprocessing controller for running clang tools in parallel."""

154

155 def __init__(self, toolname, tool_args, build_directory, filenames):

156 """Initializer method.

157

158 Args:

159 toolname: Path to the tool to execute.

160 tool_args: Arguments to be passed to the tool. Can be None.

161 build_directory: Directory that contains the compile database.

162 filenames: The files to run the tool over.

163 """

164 self.__toolname = toolname

165 self.__tool_args = tool_args

166 self.__build_directory = build_directory

167 self.__filenames = filenames

168 self.__success_count = 0

169 self.__failed_count = 0

170 self.__edit_count = 0

171 self.__edits = collections.defaultdict(list)

172

173 @property

174 def edits(self):

175 return self.__edits

176

177 @property

178 def failed_count(self):

179 return self.__failed_count

180

181 def Run(self):

182 """Does the grunt work."""

183 pool = multiprocessing.Pool()

184 result_iterator = pool.imap_unordered(

185 functools.partial(_ExecuteTool, self.__toolname, self.__tool_args,

186 self.__build_directory),

187 self.__filenames)

188 for result in result_iterator:

189 self.__ProcessResult(result)

190 sys.stdout.write('\n')

191 sys.stdout.flush()

192

193 def __ProcessResult(self, result):

194 """Handles result processing.

195

196 Args:

197 result: The result dictionary returned by _ExecuteTool.

198 """

199 if result['status']:

200 self.__success_count += 1

201 for k, v in result['edits'].iteritems():

202 self.__edits[k].extend(v)

203 self.__edit_count += len(v)

204 else:

205 self.__failed_count += 1

206 sys.stdout.write('\nFailed to process %s\n' % result['filename'])

207 sys.stdout.write(result['stderr'])

208 sys.stdout.write('\n')

209 percentage = (float(self.__success_count + self.__failed_count) /

210 len(self.__filenames)) * 100

211 sys.stdout.write('Succeeded: %d, Failed: %d, Edits: %d [%.2f%%]\r' %

212 (self.__success_count, self.__failed_count,

213 self.__edit_count, percentage))

214 sys.stdout.flush()

215	133

216	134
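
Note on the reverse-order strategy in `_ApplyEditsToSingleFile`: a simplified sketch, assuming Python 3 and plain `(offset, length, replacement)` tuples instead of `Edit` instances. `apply_edits_to_buffer` is a hypothetical name, and the conflict check and list-deletion handling from the patch are left out.

```python
def apply_edits_to_buffer(contents, edits):
  """Applies (offset, length, replacement) edits to a bytearray in place.

  Returns the number of edits applied after skipping exact duplicates.
  """
  applied = 0
  last_edit = None
  # Highest offset first: an edit never shifts the offsets of the edits
  # that are still waiting to be applied earlier in the buffer.
  for edit in sorted(edits, reverse=True):
    if edit == last_edit:
      continue  # Sorting puts identical edits next to each other.
    last_edit = edit
    offset, length, replacement = edit
    contents[offset:offset + length] = replacement
    applied += 1
  return applied


buf = bytearray(b'int foo = bar;')
count = apply_edits_to_buffer(
    buf, [(4, 3, b'abc'), (10, 3, b'quux'), (4, 3, b'abc')])
```

Here the edit at offset 10 changes the buffer length, yet the edit at offset 4 still lands on `foo` because it is applied afterwards.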

217 def _ApplyEdits(edits):	135 def _ApplyEdits(edits):

218 """Apply the generated edits.	136 """Apply the generated edits.

219	137

220 Args:	138 Args:

221 edits: A dict mapping filenames to Edit instances that apply to that file.	139 edits: A dict mapping filenames to Edit instances that apply to that file.

222 """	140 """

223 edit_count = 0	141 edit_count = 0

	142 error_count = 0

	143 done_files = 0

224 for k, v in edits.iteritems():	144 for k, v in edits.iteritems():

225 # Sort the edits and iterate through them in reverse order. Sorting allows	145 tmp_edit_count, tmp_error_count = _ApplyEditsToSingleFile(k, v)

226 # duplicate edits to be quickly skipped, while reversing means that	146 edit_count += tmp_edit_count

227 # subsequent edits don't need to have their offsets updated with each edit	147 error_count += tmp_error_count

228 # applied.	148 done_files += 1

229 v.sort()	149 percentage = (float(done_files) / len(edits)) * 100

230 last_edit = None	150 sys.stderr.write('Applied %d edits (%d errors) to %d files [%.2f%%]\r' %

231 with open(k, 'rb+') as f:	151 (edit_count, error_count, done_files, percentage))

232 contents = bytearray(f.read())	152

233 for edit in reversed(v):	153 sys.stderr.write('\n')

234 if edit == last_edit:	154 return -error_count

235 continue

236 last_edit = edit

237 contents[edit.offset:edit.offset + edit.length] = edit.replacement

238 if not edit.replacement:

239 _ExtendDeletionIfElementIsInList(contents, edit.offset)

240 edit_count += 1

241 f.seek(0)

242 f.truncate()

243 f.write(contents)

244 print 'Applied %d edits to %d files' % (edit_count, len(edits))

245	155

246	156

247 _WHITESPACE_BYTES = frozenset((ord('\t'), ord('\n'), ord('\r'), ord(' ')))	157 _WHITESPACE_BYTES = frozenset((ord('\t'), ord('\n'), ord('\r'), ord(' ')))

248	158

249	159

250 def _ExtendDeletionIfElementIsInList(contents, offset):	160 def _ExtendDeletionIfElementIsInList(contents, offset):

251 """Extends the range of a deletion if the deleted element was part of a list.	161 """Extends the range of a deletion if the deleted element was part of a list.

252	162

253 This rewriter helper makes it easy for refactoring tools to remove elements	163 This rewriter helper makes it easy for refactoring tools to remove elements

254 from a list. Even if a matcher callback knows that it is removing an element	164 from a list. Even if a matcher callback knows that it is removing an element

(...skipping 29 matching lines...)
284	194

285 if char_before:	195 if char_before:

286 if char_after:	196 if char_after:

287 del contents[offset:offset + right_trim_count]	197 del contents[offset:offset + right_trim_count]

288 elif char_before in (',', ':'):	198 elif char_before in (',', ':'):

289 del contents[offset - left_trim_count:offset]	199 del contents[offset - left_trim_count:offset]

290	200

291	201

292 def main():	202 def main():

293 parser = argparse.ArgumentParser()	203 parser = argparse.ArgumentParser()

294 parser.add_argument('tool', help='clang tool to run')

295 parser.add_argument('--all', action='store_true')

296 parser.add_argument(	204 parser.add_argument(

297 '--generate-compdb',	205 'build_directory',

298 action='store_true',	206 help='path to the build dir (dir that edit paths are relative to)')

299 help='regenerate the compile database before running the tool')

300 parser.add_argument(

301 'compile_database',

302 help='path to the directory that contains the compile database')

303 parser.add_argument(	207 parser.add_argument(

304 'path_filter',	208 'path_filter',

305 nargs='*',	209 nargs='*',

306 help='optional paths to filter what files the tool is run on')	210 help='optional paths to filter what files the tool is run on')

307 parser.add_argument(

308 '--tool-args', nargs='*',

309 help='optional arguments passed to the tool')

310 args = parser.parse_args()	211 args = parser.parse_args()

311	212

312 os.environ['PATH'] = '%s%s%s' % (

313 os.path.abspath(os.path.join(

314 os.path.dirname(__file__),

315 '../../../third_party/llvm-build/Release+Asserts/bin')),

316 os.pathsep,

317 os.environ['PATH'])

318

319 if args.generate_compdb:

320 compile_db.GenerateWithNinja(args.compile_database)

321

322 filenames = set(_GetFilesFromGit(args.path_filter))	213 filenames = set(_GetFilesFromGit(args.path_filter))

323 if args.all:	214 edits = _ParseEditsFromStdin(args.build_directory)

324 source_filenames = set(_GetFilesFromCompileDB(args.compile_database))	215 return _ApplyEdits(

325 else:	216 {k: v for k, v in edits.iteritems()

326 # Filter out files that aren't C/C++/Obj-C/Obj-C++.	217 if os.path.realpath(k) in filenames})

327 extensions = frozenset(('.c', '.cc', '.cpp', '.m', '.mm'))

328 source_filenames = [f

329 for f in filenames

330 if os.path.splitext(f)[1] in extensions]

331 dispatcher = _CompilerDispatcher(args.tool, args.tool_args,

332 args.compile_database,

333 source_filenames)

334 dispatcher.Run()

335 # Filter out edits to files that aren't in the git repository, since it's not

336 # useful to modify files that aren't under source control--typically, these

337 # are generated files or files in a git submodule that's not part of Chromium.

338 _ApplyEdits({k: v

339 for k, v in dispatcher.edits.iteritems()

340 if os.path.realpath(k) in filenames})

341 return -dispatcher.failed_count

342	218

343	219

344 if __name__ == '__main__':	220 if __name__ == '__main__':

345 sys.exit(main())	221 sys.exit(main())
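
Note on the final filtering step in `main()`: a small sketch of dropping edits whose target is not in the set of tracked files, assuming Python 3 (`items()` rather than `iteritems()`). `filter_edits` and the example paths are hypothetical; in the patch the tracked set comes from `_GetFilesFromGit`.

```python
import os.path


def filter_edits(edits, tracked_files):
  """Keeps only the edits whose file resolves to a tracked path."""
  tracked = set(os.path.realpath(p) for p in tracked_files)
  return {k: v for k, v in edits.items() if os.path.realpath(k) in tracked}


# Made-up paths; os.path.realpath() does not require them to exist.
all_edits = {'/src/a.cc': ['edit1'], '/src/out/gen/b.cc': ['edit2']}
kept = filter_edits(all_edits, ['/src/a.cc'])
```

Comparing resolved paths on both sides means a symlinked checkout does not cause tracked files to be skipped.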
