Chromium Code Reviews

Side by Side Diff: courgette/description.html

Issue 23003015: Added documentation for Courgette internals. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Created 7 years, 4 months ago
1 <h1>Courgette Internals</h1>
2
3 <h2>Patch Generation</h2>
4
5 <p><img src="generation.png" alt="Patch Generation" title="" /></p>
6
7 <ul>
8 <li><p>courgette_tool.cc:GenerateEnsemblePatch kicks off the patch
9 generation by calling ensemble_create.cc:GenerateEnsemblePatch</p></li>
10 <li><p>The files are read in by courgette:SourceStream objects</p></li>
11 <li><p>ensemble_create.cc:GenerateEnsemblePatch uses FindGenerators, which
12 uses MakeGenerator to create
13 patch_generator_x86_32.h:PatchGeneratorX86_32 classes.</p></li>
14 <li><p>PatchGeneratorX86_32's Transform method transforms the input file
15 using Courgette's core techniques that make the bsdiff delta
16 smaller. The steps it takes are the following:</p>
17
18 <ul>
19 <li><p><em>disassemble</em> the old and new binaries into AssemblyProgram
20 objects,</p></li>
21 <li><p><em>adjust</em> the new AssemblyProgram object, and</p></li>
22 <li><p><em>encode</em> the AssemblyProgram object back into raw bytes.</p></li>
23 </ul></li>
24 </ul>
25
26 <h3>Disassemble</h3>
27
28 <ul>
29 <li><p>The input is a pointer to a buffer containing the raw bytes of the
30 input file.</p></li>
31 <li><p>Disassembly converts certain machine instructions that reference
32 addresses to Courgette instructions. It is not actually
33 disassembly, but this is the term the code-base uses. Specifically,
34 it detects instructions that use absolute addresses given by the
35 binary file's relocation table, and relative addresses used in
36 relative branches.</p></li>
37 <li><p>Done by disassemble:ParseDetectedExecutable, which selects the
38 appropriate Disassembler subclass by looking at the binary file's
39 headers.</p>
40
41 <ul>
42 <li><p>disassembler_win32_x86.h defines the PE/COFF x86 disassembler</p></li>
43 <li><p>disassembler_elf_32_x86.h defines the ELF 32-bit x86 disassembler</p></li>
44 <li><p>disassembler_elf_32_arm.h defines the ELF 32-bit ARM disassembler</p></li>
45 </ul></li>
46 <li><p>The Disassembler replaces the relocation table with a Courgette
47 instruction that can regenerate the relocation table.</p></li>
48 <li><p>The Disassembler builds a list of addresses referenced by the
49 machine code, numbering each one.</p></li>
50 <li><p>The Disassembler replaces each address used in machine instructions
51 with its index number.</p></li>
52 <li><p>The output is an assembly_program.h:AssemblyProgram class, which
53 contains a list of instructions, machine or Courgette, and a mapping
54 of indices to actual addresses.</p></li>
55 </ul>
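<p>The address-numbering steps above can be sketched roughly as follows. This is a toy illustration in Python, not Courgette's C++ code: the function name <code>disassemble</code> and the tuple-based instruction format are assumptions made for the example, and it ignores the relocation table entirely.</p>

```python
def disassemble(instructions):
    """Replace each address operand with an index into an address table.

    `instructions` is a list of (opcode, address_or_None) tuples. This
    representation is hypothetical; the real AssemblyProgram is richer.
    """
    address_to_index = {}
    program = []
    for opcode, address in instructions:
        if address is None:
            # Instruction does not reference an address; keep it unchanged.
            program.append((opcode, None))
            continue
        # Number each distinct address in order of first appearance.
        index = address_to_index.setdefault(address, len(address_to_index))
        program.append((opcode, index))

    # Invert the mapping so index i gives back the actual address.
    index_to_address = [None] * len(address_to_index)
    for address, index in address_to_index.items():
        index_to_address[index] = address
    return program, index_to_address


program, table = disassemble(
    [("call", 0x4000), ("nop", None), ("jmp", 0x5000), ("call", 0x4000)]
)
```

<p>Both calls to <code>0x4000</code> end up with the same index, so the instruction stream no longer contains raw addresses, only small index numbers plus a separate table.</p>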
56
57 <h3>Adjust</h3>
58
59 <ul>
60 <li><p>This step takes the AssemblyProgram for the new file and reassigns
61 the indices that map to actual addresses. It is performed by
62 adjustment_method.cc:Adjust().</p></li>
63 <li><p>The goal is to match the indices from the old program to the new
64 program as closely as possible.</p></li>
65 <li><p>When matched correctly, machine instructions that jump to the
66 same function in both the new and old binary will look the same to
67 bsdiff, even if the function is located in a different part of the
68 binary.</p></li>
69 </ul>
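<p>To illustrate why matching indices helps, here is a minimal Python sketch. It uses a deliberately naive heuristic (pair up addresses by order of first use), which is an assumption for the example and not the matching algorithm in adjustment_method.cc. The function name <code>adjust</code> and the data layout are likewise hypothetical.</p>

```python
def adjust(old_program, new_program, new_table):
    """Renumber indices in new_program to follow old_program's numbering.

    Programs are lists of (opcode, index_or_None); new_table maps each
    index of the new program to its actual address.
    """
    def first_use_order(program):
        order = []
        for _opcode, index in program:
            if index is not None and index not in order:
                order.append(index)
        return order

    old_order = first_use_order(old_program)
    new_order = first_use_order(new_program)

    # Naive matching: the k-th address used by the new program is assumed
    # to correspond to the k-th address used by the old program.
    remap = dict(zip(new_order, old_order))

    # Addresses with no counterpart in the old program get fresh indices.
    next_free = max(old_order, default=-1) + 1
    for index in new_order:
        if index not in remap:
            remap[index] = next_free
            next_free += 1

    adjusted_program = [
        (opcode, None if index is None else remap[index])
        for opcode, index in new_program
    ]
    adjusted_table = [None] * (max(remap.values(), default=-1) + 1)
    for new_index, adjusted_index in remap.items():
        adjusted_table[adjusted_index] = new_table[new_index]
    return adjusted_program, adjusted_table


# The same two functions, numbered differently and moved in the new binary.
old_program = [("call", 0), ("jmp", 1)]
new_program = [("call", 1), ("jmp", 0)]
new_table = [0x5100, 0x4200]
adjusted_program, adjusted_table = adjust(old_program, new_program, new_table)
```

<p>After adjustment the new instruction stream is identical to the old one, and the moved addresses show up only in the index-to-address table, which is the effect the adjust step is aiming for.</p>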
70
71 <h3>Encode</h3>
72
73 <ul>
74 <li><p>This step takes an AssemblyProgram object and encodes both the
75 instructions and the mapping of indices to addresses as byte
76 vectors. This format can be written to a file directly, and is also
77 more appropriate for bsdiffing. It is done by
78 AssemblyProgram.Encode().</p></li>
79 <li><p>encoded_program.h:EncodedProgram defines the binary format and a
80 WriteTo method that writes to a file.</p></li>
81 </ul>
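<p>As a rough picture of what "encoding as byte vectors" means, the following Python sketch serializes a toy program into three separate byte strings. The layout (one byte per opcode, little-endian 32-bit operands and addresses, <code>0xFFFFFFFF</code> as a "no operand" marker) is invented for this example and is not the EncodedProgram format.</p>

```python
import struct

NO_OPERAND = 0xFFFFFFFF  # Hypothetical marker for "no index operand".


def encode(program, table):
    """Serialize (opcode, index_or_None) pairs and an address table.

    Opcodes are assumed to be integers in the range 0..255.
    Returns three byte strings: opcodes, operands, and addresses.
    """
    opcodes = bytes(opcode for opcode, _index in program)
    operands = b"".join(
        struct.pack("<I", NO_OPERAND if index is None else index)
        for _opcode, index in program
    )
    addresses = b"".join(struct.pack("<I", address) for address in table)
    return opcodes, operands, addresses


opcodes, operands, addresses = encode([(1, 0), (2, None)], [0x4000])
```

<p>Keeping opcodes, operands, and addresses in separate vectors groups similar data together, which is one plausible reason such a layout is friendlier to a byte-level differ.</p>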
82
83 <h3>bsdiff</h3>
84
85 <ul>
86 <li>simple_delta.cc:GenerateSimpleDelta</li>
87 </ul>
88
89 <h2>Patch Application</h2>
90
91 <p><img src="application.png" alt="Patch Application" title="" /></p>
92
93 <ul>
94 <li><p>courgette_tool.cc:ApplyEnsemblePatch kicks off the patch application
95 by calling ensemble_apply.cc:ApplyEnsemblePatch</p></li>
96 <li><p>ensemble_apply.cc:ApplyEnsemblePatch reads and verifies the
97 patch's header, then calls the overloaded version of
98 ensemble_apply.cc:ApplyEnsemblePatch.</p></li>
99 <li><p>The patch is read into an ensemble_apply.cc:EnsemblePatchApplication
100 object, which generates a set of patcher_x86_32.h:PatcherX86_32
101 objects for the sections in the patch.</p></li>
102 <li><p>The original file is disassembled and encoded via a call to
103 EnsemblePatchApplication.TransformUp, which in turn calls
104 patcher_x86_32.h:PatcherX86_32.Transform.</p></li>
105 <li><p>The transformed file is then bspatched via
106 EnsemblePatchApplication.SubpatchTransformedElements, which calls
107 EnsemblePatchApplication.SubpatchStreamSets, which calls
108 simple_delta.cc:ApplySimpleDelta, Courgette's built-in
109 implementation of bspatch.</p></li>
110 <li><p>Finally, EnsemblePatchApplication.TransformDown assembles the
111 patched binary data, i.e., reverses the encoding and disassembly.
112 This is done by calling PatcherX86_32.Reform, which in turn calls
113 the global function encoded_program.cc:Assemble, which calls
114 EncodedProgram.AssembleTo.</p></li>
115 </ul>
116
117 <h2>Glossary</h2>
118
119 <p><strong>Adjust</strong>: Reassign address indices in the new program to match more
Ben Chan 2013/08/20 16:50:40 not sure if we care about the 80 char limit here.
dgarrett 2013/08/21 19:03:46 I'm guessing not, especially if the file is genera
120 closely those from the old.</p>
121
122 <p><strong>Assembly program</strong>: The output of <em>disassembly</em>. Contains a list of
123 <em>Courgette instructions</em> and an index of branch target addresses.</p>
124
125 <p><strong>Assemble</strong>: Convert an <em>assembly program</em> back into an object file
126 by evaluating the <em>Courgette instructions</em> and leaving the machine
127 instructions in place.</p>
128
129 <p><strong>Courgette instruction</strong>: Replaces machine instructions in the
130 program. Courgette instructions replace branches with an index to
131 the target addresses and replace part of the relocation table.</p>
132
133 <p><strong>Disassembler</strong>: Takes a binary file and produces an <em>assembly
134 program</em>.</p>
135
136 <p><strong>Encode</strong>: Convert an <em>assembly program</em> into an <em>encoded program</em> by
137 serializing its data structures into byte vectors more appropriate
138 for storage in a file.</p>
139
140 <p><strong>Encoded Program</strong>: The output of encoding.</p>
141
142 <p><strong>Ensemble</strong>: A Courgette-style patch containing sections for the list
143 of branch addresses and the encoded program. It supports patching
144 multiple object files at once.</p>
145
146 <p><strong>Opcode</strong>: The number corresponding to either a machine or <em>Courgette
147 instruction</em>.</p>