|
|
Chromium Code Reviews|
Created:
6 years, 10 months ago by Karl Modified:
6 years, 10 months ago CC:
native-client-reviews_googlegroups.com Base URL:
http://git.chromium.org/native_client/pnacl-llvm.git@master Visibility:
Public. |
DescriptionMake pnacl-bccompress add abbreviations for obvious constants.
Updates pnacl-bccompress to add abbreviations by inserting
constants into existing abbreviations, when the generated
distribution maps show that a particular record element, the
same value is used in a large number of records.
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3774
R=jvoung@chromium.org
Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-llvm.git;a=commit;h=cc1b8c7
Patch Set 1 #
Total comments: 45
Patch Set 2 : Fix issues for patch set 1. #
Total comments: 16
Patch Set 3 : Fix issues raised in Patch Set 1 and 2. #
Total comments: 10
Patch Set 4 : Fix issues raised in Patch Set 3. #
Total comments: 4
Patch Set 5 : Fixs nits associated with Patch Set 4. #
Messages
Total messages: 11 (0 generated)
PTAL. Thanks.
https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:103: // Prints out the abbreviation in a readable form to the given stream. document return bool value? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:235: // abbreviation. what goes wrong if the abbrev isn't like the default? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:291: /// Updates Index to the index where the abbreviation is located The documentation for "Updates Index" doesn't seem useful here, since this is a wrapper that will ignore it. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:524: /// Finds the index to the corresponding internal block abbrevation abb https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:645: AbbrevOp = Abbrev->getOperandInfo(NextOp+1); How come NextOp isn't bumped for Array? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:728: return A1.Compare(A2); shouldn't this check if Compare returns -1 or something? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:732: /// before we will even consider it a potential candidate abbrevation. abb https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:737: /// abbrevaition. This filter is done to only keep candidates that abbreviation https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:775: /// associated with this set of candidate abbreviations. This to adjust the lower bound set by "MinNumInstancesForNewAbbrevs" so that there is a proportional component too? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:816: NaClBitCodeAbbrev *Abbrev = UnrolledAbbrev.roll(); Is Unroll() and roll() and lossy because of the cutoff? Would the test that it doesn't *already* exist in BlockAbbrevsMap[] then fail, and it could proceed to be added? What happens then? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:825: // If reached, we should add the given abbrevation to the set of abbreviation https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:858: // have added new candidate abbrevations to CandAbbrevs. abbreviations https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:893: // read with the given abbreviation Abbrev. IndexDist is the Any idea how IndexDist looks like? I would think that most instructions are the same arity, so IndexDist would be pretty flat? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:920: // Look for new abbrevaitions in block BlockID, considering it was abbreviations https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:954: // read (globally defined) abbreviations. AbbrevDist is the Abbrevs is already known to be associated with the BlockID, right? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:969: NaClBitCodeAbbrev *Abbrev = Abbrevs->GetIndexedAbbrev(AbbrevIndex); It seems like an *existing* Abbrev gets pulled out of Abbrevs. Then the long chain of AddNewAbbreviations() seems to modify that these given Abbrevs. Is it via Unroll and Roll that you end up cloning those existing Abbrevs as templates for new Abbrevs? https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:994: // abbrevations. Any found abbreviations are added to the set of abb. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:1171: // dumped BlockInfoBlock. So the BlockInfoBlock will be skipped?
https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:103: // Prints out the abbreviation in a readable form to the given stream. On 2014/02/05 00:24:16, jvoung (cr) wrote: > document return bool value? Removed the need for this by adding method NumArguments to class NaClBitCodeAbbrevOp. Also, moved this (and PrintAbbrev below) to be methods as well. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:235: // abbreviation. On 2014/02/05 00:24:16, jvoung (cr) wrote: > what goes wrong if the abbrev isn't like the default? The main reason for doing this is so that we don't need to treat records that use the default abbreviation as a special case. Rather, it will just using the default abbreviation defined here. The other non-application abbreviations do not generate records. Rather, we are only adding an abbreviation as a filler so that we don't need to adjust abbreviation indices. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:291: /// Updates Index to the index where the abbreviation is located On 2014/02/05 00:24:16, jvoung (cr) wrote: > The documentation for "Updates Index" doesn't seem useful here, since this is a > wrapper that will ignore it. Sorry, I really should be more careful when cutting/pasting things. Removing comments about index. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:524: /// Finds the index to the corresponding internal block abbrevation On 2014/02/05 00:24:16, jvoung (cr) wrote: > abb Fixed spelling of abbreviation. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:645: AbbrevOp = Abbrev->getOperandInfo(NextOp+1); On 2014/02/05 00:24:16, jvoung (cr) wrote: > How come NextOp isn't bumped for Array? When array appears in an abbreviation expression, the argument (i.e. next argument) is used to match all remaining elements. Hence, we do not move forward. Adding a comment to make this more clear. Also modified this class to have MoreOp field to hold the abbreviation of the array, since we want to use the provided abbreviation (rather than the default) when rolling. Further, modified the method NaClBitCodeAbbrev::Add to "simplify the abbreviation expression, so that redundant abbreviations of the form 1) VBR(6) Array(VBR(6)) 2) Array(VBR(6)) are simplified to the second form. This doesn't cause a problem for existing bitcode files, since it never generated case 1). On the other hand, it simplifies the code here (and abbreviation comparison) to not have to worry about generating case 1. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:660: class CandBlockAbbrev { Realized that putting NumInstances in this class was a mistake. Rather, what we really want is a map from this class (minus the NumInstancs) to the corresponding NumInstances. This way, we can update the number of intances if it is added more than once. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:728: return A1.Compare(A2); On 2014/02/05 00:24:16, jvoung (cr) wrote: > shouldn't this check if Compare returns -1 or something? My bad. Fixing. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:732: /// before we will even consider it a potential candidate abbrevation. On 2014/02/05 00:24:16, jvoung (cr) wrote: > abb Done. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:737: /// abbrevaition. This filter is done to only keep candidates that On 2014/02/05 00:24:16, jvoung (cr) wrote: > abbreviation Done. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:739: class CandidateAbbrevs { Modified this class. In particular, modified that the Abbreviations are stored as a map CandAbbrev -> NumInstances, so that we can update the number of instances should the same candidate abbreviation be added more than once. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:775: /// associated with this set of candidate abbreviations. On 2014/02/05 00:24:16, jvoung (cr) wrote: > This to adjust the lower bound set by "MinNumInstancesForNewAbbrevs" so that > there is a proportional component too? Yes. We set the maximum based on the number of bits, and set the lower bound to (roughly) be proportional to a good size chunk. This allows more abbreviations to be added in a single pass, at the cost of not necessarily getting the best compression. This is a simple heuristic to get us going. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:816: NaClBitCodeAbbrev *Abbrev = UnrolledAbbrev.roll(); On 2014/02/05 00:24:16, jvoung (cr) wrote: > Is Unroll() and roll() and lossy because of the cutoff? > > Would the test that it doesn't *already* exist in BlockAbbrevsMap[] then fail, > and it could proceed to be added? What happens then? Fixing the code for class UnrolledAbbreviation so that the operations are not lossy. As far as the test below, it mainly is to guarantee that we don't introduce duplicate abbreviations, which is never a good thing. However, it doesn't check if the abbreviation will actually be used. A planned improvement is to add a verification step that removes an abbreviation that isn't applicable anymore. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:825: // If reached, we should add the given abbrevation to the set of On 2014/02/05 00:24:16, jvoung (cr) wrote: > abbreviation Done. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:858: // have added new candidate abbrevations to CandAbbrevs. On 2014/02/05 00:24:16, jvoung (cr) wrote: > abbreviations Done. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:893: // read with the given abbreviation Abbrev. IndexDist is the On 2014/02/05 00:24:16, jvoung (cr) wrote: > Any idea how IndexDist looks like? > > I would think that most instructions are the same arity, so IndexDist would be > pretty flat? Not sure what is being asked here. First, the (bccompress) distribution maps separate on code and size first. Hence, this call will always be for the same instruction of the same length. At this step, we want to improve on the unrolled abbreviation for each index (in the index distribution map). Note that while the number of instances are flat, as you suggest, the importance is NOT based on this concept. Rather, it sorts them based on the distribution of values for the index, skewing values with large indices to the beginning of the (sorted) distribution. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:920: // Look for new abbrevaitions in block BlockID, considering it was On 2014/02/05 00:24:16, jvoung (cr) wrote: > abbreviations Done. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:954: // read (globally defined) abbreviations. AbbrevDist is the On 2014/02/05 00:24:16, jvoung (cr) wrote: > Abbrevs is already known to be associated with the BlockID, right? yes. Added qualifier to comment. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:969: NaClBitCodeAbbrev *Abbrev = Abbrevs->GetIndexedAbbrev(AbbrevIndex); On 2014/02/05 00:24:16, jvoung (cr) wrote: > It seems like an *existing* Abbrev gets pulled out of Abbrevs. > > Then the long chain of AddNewAbbreviations() seems to modify that these given > Abbrevs. > > Is it via Unroll and Roll that you end up cloning those existing Abbrevs as > templates for new Abbrevs? Yes. Unroll creates a copy of the abbreviation, and the unrolled version is the one that is updated (and eventually rolled back into a new abbreviation). The main purpose of using an unrolled abbreviation is that we can change elements (which isn't allowed in class NaClBitcodeAbbrev) based on the model of the distribution maps. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:994: // abbrevations. Any found abbreviations are added to the set of On 2014/02/05 00:24:16, jvoung (cr) wrote: > abb. Done. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:1013: // Install candidate abbreviations. Modified this code to sort the abbreviations by the number of instances, so that if multiple abbreviations apply to the record, the abbreviation with the largest number of instances will be chosen first. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:1062: AddNewAbbreviations(Parser.BlockDist, BlockAbbrevsMap, CandAbbrevs); After reading the code more carefully, I realized that CandAbbrevs should be generated in the called AddNewAbbreviation, since it is local to that function. Hence, moving the code. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:1171: // dumped BlockInfoBlock. On 2014/02/05 00:24:16, jvoung (cr) wrote: > So the BlockInfoBlock will be skipped? No. There are generated by method ExitBlockInfo below. Remember that the bitstream reader processes the BlockInfo block internally, and is never passed to the bitcode readers. Hence, the method ExitBlockInfo was added to allow processing of that internally collected information. However, we do make sure that we don't generate global module abbreviations, since they were inserted here. In addition, it would be difficult to modify the bitstream reader to automatically export global abbreviations as the are read. In particular, doing so would cause problems for blocks that have already defined local abbreviations (i.e. it would change the abbreviation index dynamically and code would have to be aware of this change.
partial review -- more later https://codereview.chromium.org/154603002/diff/60001/include/llvm/Bitcode/NaC... File include/llvm/Bitcode/NaCl/NaClBitCodes.h (right): https://codereview.chromium.org/154603002/diff/60001/include/llvm/Bitcode/NaC... include/llvm/Bitcode/NaCl/NaClBitCodes.h:158: if (isEncoding() && (Encoding)Enc == Array) same as isArrayOp() ? https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... File lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp (right): https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:41: Stream << "??"; This shouldn't be possible right? https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:45: Stream << "??"; Should be impossible? https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:87: && Abbrev->OperandList.back().isArrayOp() How can the NaClBitCodeAbbrevOp::Array be the last thing in the list of operands? Doesn't some other encoding op need to show up after, to know the type of the array elements? You seem to be looking for the type of the array element before the NaClBitCodeAbbrevOp::Array instead. Also, there isn't a restriction on where the first "Op" in the sequence "Op Array(Op)" comes, since you are starting from index 0? Or am I understanding this wrong? Does this come up because of something bccompress does? It doesn't happen in the default set of abbrevs right?
https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:645: AbbrevOp = Abbrev->getOperandInfo(NextOp+1); On 2014/02/06 19:46:28, Karl wrote: > On 2014/02/05 00:24:16, jvoung (cr) wrote: > > How come NextOp isn't bumped for Array? > > When array appears in an abbreviation expression, the argument (i.e. next > argument) is used to match all remaining elements. Hence, we do not move > forward. Adding a comment to make this more clear. > > Also modified this class to have MoreOp field to hold the abbreviation of the > array, since we want to use the provided abbreviation (rather than the default) > when rolling. > > Further, modified the method NaClBitCodeAbbrev::Add to "simplify the > abbreviation expression, so that redundant abbreviations of the form > > 1) VBR(6) Array(VBR(6)) > 2) Array(VBR(6)) > > are simplified to the second form. This doesn't cause a problem for existing > bitcode files, since it never generated case 1). On the other hand, it > simplifies the code here (and abbreviation comparison) to not have to worry > about generating case 1. Okay, so this is to peel out of the array during unrolling, when "Size" > getNumOperandInfos(). E.g., you start with VBR(6) Array(VBR(8)) (so #ops is 3) but you might be given a Size of 6. You conceptually up with: VBR(6) VBR(8) VBR(8) VBR(8) VBR(8) VBR(8) Array(VBR(8)). that's split into in the internal representation: AbbrevOps = [VBR(6) VBR(8) VBR(8) VBR(8) VBR(8) VBR(8)] MoreOps = [Array, VBR(8)] The fact that you *don't* bump NextOp to NextOp + 2 or something, is so that MoreOps can record both the "Array" and the type of the array elements into MoreOps? Then roll() gives you back: VBR(6) Array(VBR(8)) ? I think that's what your comment documenting this class was trying to explain, but I didn't quite catch it the first time w/out an example... (and maybe I'm still missing something). It seems like Simplify could go with this class, since it's somewhat tied to Roll and Unroll. However, there are some uses outside of this class. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:893: // read with the given abbreviation Abbrev. IndexDist is the On 2014/02/06 19:46:28, Karl wrote: > On 2014/02/05 00:24:16, jvoung (cr) wrote: > > Any idea how IndexDist looks like? > > > > I would think that most instructions are the same arity, so IndexDist would be > > pretty flat? > > Not sure what is being asked here. First, the (bccompress) distribution maps > separate on code and size first. Hence, this call will always be for the same > instruction of the same length. At this step, we want to improve on the unrolled > abbreviation for each index (in the index distribution map). > > Note that while the number of instances are flat, as you suggest, the importance > is NOT based on this concept. Rather, it sorts them based on the distribution of > values for the index, skewing values with large indices to the beginning of the > (sorted) distribution. Sorry, just lost a bit of how the whole thing was structured and was wondering if one of the layers could be dropped. The IndexDist seemed to have a pretty flat distribution. Okay, so this layer was responsible for peeking down a layer and seeing how skewed things are. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:1013: // Install candidate abbreviations. On 2014/02/06 19:46:28, Karl wrote: > Modified this code to sort the abbreviations by the number of instances, so that > if multiple abbreviations apply to the record, the abbreviation with the largest > number of instances will be chosen first. What are some examples of when the two *different* abbreviation can apply to the same record? https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... tools/pnacl-bccompress/pnacl-bccompress.cpp:711: // The set of abbreviations and corresponding number instancs. instances https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... tools/pnacl-bccompress/pnacl-bccompress.cpp:736: // this number of instances. When do collisions happen? Examples? How come it doesn't add the NumInstances to the Pos->second, and the abbrevs remain separately counted? I guess bumping it will make you need to adjust MaxInstances, etc. and update the range as below. https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... tools/pnacl-bccompress/pnacl-bccompress.cpp:1110: if (Abbrevs == 0) return; when can Abbrevs == 0, if Abbrevs is returned by GetGlobalAbbrevs(), which checks if Abbrevs == 0 and creates a new one?
https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:645: AbbrevOp = Abbrev->getOperandInfo(NextOp+1); On 2014/02/06 23:19:36, jvoung (cr) wrote: > On 2014/02/06 19:46:28, Karl wrote: > > On 2014/02/05 00:24:16, jvoung (cr) wrote: > > > How come NextOp isn't bumped for Array? > > > > When array appears in an abbreviation expression, the argument (i.e. next > > argument) is used to match all remaining elements. Hence, we do not move > > forward. Adding a comment to make this more clear. > > > > Also modified this class to have MoreOp field to hold the abbreviation of the > > array, since we want to use the provided abbreviation (rather than the > default) > > when rolling. > > > > Further, modified the method NaClBitCodeAbbrev::Add to "simplify the > > abbreviation expression, so that redundant abbreviations of the form > > > > 1) VBR(6) Array(VBR(6)) > > 2) Array(VBR(6)) > > > > are simplified to the second form. This doesn't cause a problem for existing > > bitcode files, since it never generated case 1). On the other hand, it > > simplifies the code here (and abbreviation comparison) to not have to worry > > about generating case 1. > > > Okay, so this is to peel out of the array during unrolling, when "Size" > > getNumOperandInfos(). > > E.g., you start with VBR(6) Array(VBR(8)) (so #ops is 3) > > but you might be given a Size of 6. You conceptually up with: > > VBR(6) VBR(8) VBR(8) VBR(8) VBR(8) VBR(8) Array(VBR(8)). > > that's split into in the internal representation: > > AbbrevOps = [VBR(6) VBR(8) VBR(8) VBR(8) VBR(8) VBR(8)] > MoreOps = [Array, VBR(8)] > > The fact that you *don't* bump NextOp to NextOp + 2 or something, is so that > MoreOps can record both the "Array" and the type of the array elements into > MoreOps? > > Then roll() gives you back: VBR(6) Array(VBR(8)) ? > > I think that's what your comment documenting this class was trying to explain, > but I didn't quite catch it the first time w/out an example... (and maybe I'm > still missing something). It seems like Simplify could go with this class, since > it's somewhat tied to Roll and Unroll. However, there are some uses outside of > this class. > Ok. I've updated the comment with an example that should hopefully make this clearer. Also note that I use Simplify while reading, in case an external code generator generates a non-simplified abbreviation. Further, since only Print and Simplify methods exist in NaClBitCodes.cpp, they are (currently) only included in this executable. https://codereview.chromium.org/154603002/diff/1/tools/pnacl-bccompress/pnacl... tools/pnacl-bccompress/pnacl-bccompress.cpp:1013: // Install candidate abbreviations. On 2014/02/06 23:19:36, jvoung (cr) wrote: > On 2014/02/06 19:46:28, Karl wrote: > > Modified this code to sort the abbreviations by the number of instances, so > that > > if multiple abbreviations apply to the record, the abbreviation with the > largest > > number of instances will be chosen first. > > > What are some examples of when the two *different* abbreviation can apply to the > same record? It may be the case that value index 3 has 300 instances of 1, and value index 4 has 1000 instances of 2. Each produce a new abbreviation. Some subset of the 1000 instances with value index 4 being 2 also may have value index 3 being 1. In those cases, we want to make sure that the bitcode writer chooses the second abbreviation over the first. In the current scheme, that means we want the second abbreviation to appear before the first, so that GetRecordAbbrevIndex will choose the second abbreviation (GetRecordAbbrevIndex searches the abbreviations linearly, choosing the first abbreviation that generates the smallest number of bits). Adding comment to make this more clear. https://codereview.chromium.org/154603002/diff/60001/include/llvm/Bitcode/NaC... File include/llvm/Bitcode/NaCl/NaClBitCodes.h (right): https://codereview.chromium.org/154603002/diff/60001/include/llvm/Bitcode/NaC... include/llvm/Bitcode/NaCl/NaClBitCodes.h:158: if (isEncoding() && (Encoding)Enc == Array) On 2014/02/06 20:37:19, jvoung (cr) wrote: > same as isArrayOp() ? Done. https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... File lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp (right): https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:41: Stream << "??"; On 2014/02/06 20:37:19, jvoung (cr) wrote: > This shouldn't be possible right? Yes. Adding assert to cause failure. https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:45: Stream << "??"; On 2014/02/06 20:37:19, jvoung (cr) wrote: > Should be impossible? Yes. Adding assert to cause failure. https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:87: && Abbrev->OperandList.back().isArrayOp() On 2014/02/06 20:37:19, jvoung (cr) wrote: > How can the NaClBitCodeAbbrevOp::Array be the last thing in the list of > operands? Doesn't some other encoding op need to show up after, to know the type > of the array elements? You seem to be looking for the type of the array element > before the NaClBitCodeAbbrevOp::Array instead. Also, there isn't a restriction > on where the first "Op" in the sequence "Op Array(Op)" comes, since you are > starting from index 0? > > Or am I understanding this wrong? > > Does this come up because of something bccompress does? > It doesn't happen in the default set of abbrevs right? This is a nested loop, where we are incrementally copying elements from this to Abbrev. As such, the array element can appear last (i.e. added by the last iteration). Apparently this wasn't clear. Rewriting to make this more explicit (and simplier). https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... tools/pnacl-bccompress/pnacl-bccompress.cpp:711: // The set of abbreviations and corresponding number instancs. On 2014/02/06 23:19:37, jvoung (cr) wrote: > instances Done. https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... tools/pnacl-bccompress/pnacl-bccompress.cpp:736: // this number of instances. On 2014/02/06 23:19:37, jvoung (cr) wrote: > When do collisions happen? Examples? > > How come it doesn't add the NumInstances to the Pos->second, and the abbrevs > remain separately counted? > > I guess bumping it will make you need to adjust MaxInstances, etc. and update > the range as below. After thinking about this some more, the whole concept of min/max cutoffs being applied incrementally will not work. As mentioned above, we can have cases where refinements A->B->C and A->D->C need to be merged. Hence, I am rewriting this class to apply the filter at the end, rather than incrementally. https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... tools/pnacl-bccompress/pnacl-bccompress.cpp:1110: if (Abbrevs == 0) return; On 2014/02/06 23:19:37, jvoung (cr) wrote: > when can Abbrevs == 0, if Abbrevs is returned by GetGlobalAbbrevs(), which > checks if Abbrevs == 0 and creates a new one? Removed.
https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... File lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp (right): https://codereview.chromium.org/154603002/diff/60001/lib/Bitcode/NaCl/Reader/... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:87: && Abbrev->OperandList.back().isArrayOp() On 2014/02/07 19:37:29, Karl wrote: > On 2014/02/06 20:37:19, jvoung (cr) wrote: > > How can the NaClBitCodeAbbrevOp::Array be the last thing in the list of > > operands? Doesn't some other encoding op need to show up after, to know the > type > > of the array elements? You seem to be looking for the type of the array > element > > before the NaClBitCodeAbbrevOp::Array instead. Also, there isn't a restriction > > on where the first "Op" in the sequence "Op Array(Op)" comes, since you are > > starting from index 0? > > > > Or am I understanding this wrong? > > > > Does this come up because of something bccompress does? > > It doesn't happen in the default set of abbrevs right? > > This is a nested loop, where we are incrementally copying elements from this to > Abbrev. As such, the array element can appear last (i.e. added by the last > iteration). Apparently this wasn't clear. Rewriting to make this more explicit > (and simplier). Ah, I mixed up the "this" Abbrev and the new Abbrev... https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/60001/tools/pnacl-bccompress/p... tools/pnacl-bccompress/pnacl-bccompress.cpp:711: // The set of abbreviations and corresponding number instancs. On 2014/02/07 19:37:29, Karl wrote: > On 2014/02/06 23:19:37, jvoung (cr) wrote: > > instances > > Done. sorry missed this in last comment: number *of* instances https://codereview.chromium.org/154603002/diff/200001/lib/Bitcode/NaCl/Reader... File lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp (right): https://codereview.chromium.org/154603002/diff/200001/lib/Bitcode/NaCl/Reader... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:41: assert(false); llvm_unreachable("...") instead ? https://codereview.chromium.org/154603002/diff/200001/lib/Bitcode/NaCl/Reader... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:87: // Op Array(Op) -> Array(Op) perhaps check if that the abbrev is well formed: assert(!Op.isArrayOp() || i == (OperandList.size()-2)) or something like that? https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:503: /// in [VBR(6), Lit(4), VBR(6), Array(VBR(6))]. For records of size 3, won't the fact that the abbreviation had an "Array(VBR(6))" at the end, not really be used? I guess that makes it more generally applicable? Or would it have been better to create fresh "[VBR(6), Lit(4), VBR(6)]" abbreviations instead of using Arrays as templates to unroll and reroll? It seems like all the Array is doing is pick a VBR size for you, or if it's Fixed vs VBR. The initial set coming from the NaClBitcodeWriter is Fixed 7 or 8, or Char6, or Fixed/VBR TypeIdNumBits. Are those really the most useful seedings? https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:903: // of instances will be chosen when generating compressed file. "when compressing a file" https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:949: errs() << "--\n"; if (TraceGeneratedAbbreviations) ?
https://codereview.chromium.org/154603002/diff/200001/lib/Bitcode/NaCl/Reader... File lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp (right): https://codereview.chromium.org/154603002/diff/200001/lib/Bitcode/NaCl/Reader... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:41: assert(false); On 2014/02/07 21:34:10, jvoung (cr) wrote: > llvm_unreachable("...") instead ? Done. https://codereview.chromium.org/154603002/diff/200001/lib/Bitcode/NaCl/Reader... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:87: // Op Array(Op) -> Array(Op) On 2014/02/07 21:34:10, jvoung (cr) wrote: > perhaps check if that the abbrev is well formed: > > assert(!Op.isArrayOp() || i == (OperandList.size()-2)) > > or something like that? Done. https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:503: /// in [VBR(6), Lit(4), VBR(6), Array(VBR(6))]. On 2014/02/07 21:34:10, jvoung (cr) wrote: > For records of size 3, won't the fact that the abbreviation had an > "Array(VBR(6))" at the end, not really be used? I guess that makes it more > generally applicable? > > Or would it have been better to create fresh "[VBR(6), Lit(4), VBR(6)]" > abbreviations instead of using Arrays as templates to unroll and reroll? It > seems like all the Array is doing is pick a VBR size for you, or if it's Fixed > vs VBR. > > The initial set coming from the NaClBitcodeWriter is Fixed 7 or 8, or Char6, or > Fixed/VBR TypeIdNumBits. Are those really the most useful seedings? Your point is well taken. Removing the array explains the 10% I lost! Also, there are questions on whether we should be flooding the abbreviation index space with special cases (based on size and constant insertion). For now, I would rather use the more conservative approach unrolling gives us. https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:903: // of instances will be chosen when generating compressed file. On 2014/02/07 21:34:10, jvoung (cr) wrote: > "when compressing a file" Done. https://codereview.chromium.org/154603002/diff/200001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:949: errs() << "--\n"; On 2014/02/07 21:34:10, jvoung (cr) wrote: > if (TraceGeneratedAbbreviations) ? Done.
otherwise lgtm https://codereview.chromium.org/154603002/diff/260001/lib/Bitcode/NaCl/Reader... File lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp (right): https://codereview.chromium.org/154603002/diff/260001/lib/Bitcode/NaCl/Reader... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:43: assert(false); can remove the assert(false) then, here and below https://codereview.chromium.org/154603002/diff/260001/tools/pnacl-bccompress/... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/260001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:560: /// abbreviation. If simplify is true, we simplify the first simplify -> Simplify
Message was sent while issue was closed.
Committed patchset #5 manually as rcc1b8c7 (presubmit successful).
Message was sent while issue was closed.
https://codereview.chromium.org/154603002/diff/260001/lib/Bitcode/NaCl/Reader... File lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp (right): https://codereview.chromium.org/154603002/diff/260001/lib/Bitcode/NaCl/Reader... lib/Bitcode/NaCl/Reader/NaClBitCodes.cpp:43: assert(false); On 2014/02/07 23:50:27, jvoung (cr) wrote: > can remove the assert(false) then, here and below Done. https://codereview.chromium.org/154603002/diff/260001/tools/pnacl-bccompress/... File tools/pnacl-bccompress/pnacl-bccompress.cpp (right): https://codereview.chromium.org/154603002/diff/260001/tools/pnacl-bccompress/... tools/pnacl-bccompress/pnacl-bccompress.cpp:560: /// abbreviation. If simplify is true, we simplify the On 2014/02/07 23:50:27, jvoung (cr) wrote: > first simplify -> Simplify Done. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
