| Index: include/llvm/Bitcode/NaCl/NaClBitcodeDist.h
|
| diff --git a/include/llvm/Bitcode/NaCl/NaClBitcodeDist.h b/include/llvm/Bitcode/NaCl/NaClBitcodeDist.h
|
| new file mode 100644
|
| index 0000000000000000000000000000000000000000..54b16a4cac361f0584ae5118eeb628704d60bd03
|
| --- /dev/null
|
| +++ b/include/llvm/Bitcode/NaCl/NaClBitcodeDist.h
|
| @@ -0,0 +1,507 @@
|
| +//===- NaClBitcodeDist.h ---------------------------------------*- C++ -*-===//
|
| +// Maps distributions of values in PNaCl bitcode files.
|
| +//
|
| +// The LLVM Compiler Infrastructure
|
| +//
|
| +// This file is distributed under the University of Illinois Open Source
|
| +// License. See LICENSE.TXT for details.
|
| +//
|
| +//===----------------------------------------------------------------------===//
|
| +//
|
| +// Creates a (nestable) distribution map of values in PNaCl bitcode.
|
| +// The domain of these maps is the set of record values being
|
| +// tracked. The range is the information associated with each block
|
| +// and/or record value, including the number of instances of that value. The
|
| +// distribution map is nested if the range element contains another
|
| +// distribution map.
|
| +//
|
| +// The goal of distribution maps is to build a (histogram)
|
| +// distribution of values in bitcode records and blocks, of a PNaCl
|
| +// bitcode file. From appropriately built distribution maps, one can
|
| +// infer possible new abbreviations that can be used in the PNaCl
|
| +// bitcode file. Hence, one of the primary goals of distribution maps
|
| +// is to support tools pnacl-bcanalyzer and pnacl-bccompress.
|
| +//
|
| +// Distribution maps are constructed either from NaClBitcodeBlock's or
|
| +// NaClBitcodeRecord's (in NaClBitcodeParser.h), but not both. It is
|
| +// assumed that only blocks, or records (but not both) are added to a
|
| +// distribution map. To add data to the distribution map, one calls
|
| +// AddRecord and/or AddBlock. If the distribution map contains record
|
| +// values, you must call AddRecord for each record to be put into the
|
| +// distribution map. If the distribution map contains block values (i.e.
|
| +// block ID's), you must call AddBlock for each block to be put into
|
| +// the distribution map.
|
| +//
|
| +// While it may counterintuitive, one can call both AddRecord and
|
| +// AddBlock, for each corresponding record and block processed. The
|
| +// reason for this is that an internal flag StorageKind is kept within
|
| +// distribution maps. If the flag value doesn't correspond to the
|
| +// type of called add method, no update is done. This behaviour is
|
| +// done so that nested distribution maps can be updated via blind
|
| +// calls in NaClBitcodeAnalyzer.cpp.
|
| +//
|
| +// Via inheritance, and overriding the (virtual) AddRecord
|
| +// method for a distribution map, we can redirect the add to look up
|
| +// the distribution element associated with the block of the record,
|
| +// and then update the corresponding record distribution map. In general,
|
| +// it only makes sense (for nested distribution maps) to be able to
|
| +// redirect record additions. Redirecting blocks within a record (since
|
| +// a record is only associated with one block) does not make sense. Hence,
|
| +// we have not made AddBlock virtual.
|
| +//
|
| +// When updating a block distribution map, the domain value is the
|
| +// BlockID of the corresponding block being added.
|
| +//
|
| +// On the other hand, values associated with record distribution maps
|
| +// are many possible values (the code, the abbreviation, the values
|
| +// etc). To make the API uniform, record distribution maps are updated
|
| +// using NaClBitcodeRecords (in NaClBitcodeParser.h). The values from
|
| +// the record are defined by the extract method GetValueList, and
|
| +// added via method AddRecord.
|
| +//
|
| +// Distribution maps are implemented using two classes:
|
| +//
|
| +// NaClBitcodeDist
|
| +// A generic distribution map.
|
| +//
|
| +// NaClBitcodeDistElement
|
| +// The implementation of elements in the range of the distribution
|
| +// map.
|
| +//
|
| +// The code has been written to put the (virtual) logic of
|
| +// distribution maps into derived classes of NaClBitcodeDistElement
|
| +// whenever possible. This is intentional, in that it keeps all
|
| +// knowledge of how to handle/print elements in one class. However,
|
| +// because some distributions have external data that is needed by all
|
| +// elements, the virtual methods of class NaClBitcodeDist can be
|
| +// overridden, and not delegate to NaClBitcodeDistElement.
|
| +//
|
| +// To do this, an NaClBitcodeDist requires a "sentinel" (derived)
|
| +// instance of NaClBitcodeDistElement. This sentinel is used to define
|
| +// behaviour needed by distribution maps.
|
| +//
|
| +// By having control passed to the (derived) instances of
|
| +// NaClBitcodeDistElement, it also makes handling nested distributions
|
| +// relatively easy. One just extends the methods AddRecord and/or
|
| +// AddBlock to also update the corresponding nested distribution.
|
| +//
|
| +// The exception to this rule is printing, since header information is
|
| +// dependent on properties of any possible nested distribution maps
|
| +// (for example, we copy column headers after each nested distribution
|
| +// map so that it is easier to read the output). To fix this, we let
|
| +// virtual method NaClBitcodeDistElement::GetNestedDistributions
|
| +// return the array of nested distribution pointers for the nested
|
| +// distributions of that map. The order in the array is the order the
|
| +// nested distributions will be printed. Typically, this is
|
| +// implemented as a field of the distribution element, and is
|
| +// initialized to contain the pointers of all nested distributions in
|
| +// the element. This field can then be returned from method
|
| +// GetNestedDistributions.
|
| +//
|
| +// Distribution maps are sortable (via method GetDistribution). The
|
| +// purpose of sorting is to find interesting elements. This is done by
|
| +// sorting the values in the domain of the distribution map, based on
|
| +// the GetImportance method of the range element.
|
| +//
|
| +// Method GetImportance defines how (potentially) interesting the
|
| +// value is in the distribution. "Interesting" is based on the notion
|
| +// of how likely will the value show a case where adding an
|
| +// abbreviation will shrink the size of the corresponding bitcode
|
| +// file. For most distributions, the number of instances associated
|
| +// with the value is the best measure.
|
| +//
|
| +// However, for cases where multiple domain entries are created for
|
| +// the same NaClBitcodeRecord (i.e. method GetValueList defines more
|
| +// than one value), things are not so simple.
|
| +
|
| +// For example, for some distributions (such as value index distributions)
|
| +// the numbers of instances isn't sufficient. In such cases, you may
|
| +// have to look at nested distributions to find important cases.
|
| +//
|
| +// In the case of value index distributions, when the size of the
|
| +// records is the same, all value indices have the same number of
|
| +// instances. In this case, "interesting" may be measured in terms of
|
| +// the (nested) distribution of the values that can appear at that
|
| +// index, and how often each value appears.
|
| +//
|
| +// The larger the importance, the more interesting the value is
|
| +// considered, and sorting is based on moving interesting values to
|
| +// the front of the sorted list.
|
| +//
|
| +// When printing distribution maps, the values are sorted based on
|
| +// the importance. By default, importance is based on the number of
|
| +// times the value appears in records, putting the most used values
|
| +// at the top of the printed list.
|
| +//
|
| +// Since sorting is expensive, the sorted distribution is built once
|
| +// and cached. This cache is flushed whenever the distribution map is
|
| +// updated, so that a new sorted distribuition will be generated.
|
| +//
|
| +// Printing of distribution maps are stylized, so that virtuals can
|
| +// easily fill in the necessary data.
|
| +//
|
| +// For concrete instances of NaClBitcodeDistElement, the following
|
| +// virtual method must be defined:
|
| +//
|
| +// CreateElement
|
| +// Creates a new instance of the distribution element, to
|
| +// be put into the corresponding distribution map when a new
|
| +// value is added to the distribution map.
|
| +//
|
| +// In addition, if the distribution element is based on record values,
|
| +// the virtual method GetValueList must be defined, to extract values
|
| +// out of the bitcode record.
|
| +
|
| +#ifndef LLVM_BITCODE_NACL_NACLBITCODEDIST_H
|
| +#define LLVM_BITCODE_NACL_NACLBITCODEDIST_H
|
| +
|
| +#include "llvm/Bitcode/NaCl/NaClBitcodeParser.h"
|
| +#include "llvm/Support/Casting.h"
|
| +#include "llvm/Support/Format.h"
|
| +#include "llvm/Support/raw_ostream.h"
|
| +
|
| +#include <map>
|
| +#include <vector>
|
| +
|
| +namespace llvm {
|
| +
|
| +/// The domain type of PNaCl bitcode record distribution maps.
|
| +typedef uint64_t NaClBitcodeDistValue;
|
| +
|
| +/// Base class of the range type of PNaCl bitcode record distribution
|
| +/// maps.
|
| +class NaClBitcodeDistElement;
|
| +
|
| +/// Type defining the list of values extracted from the corresponding
|
| +/// bitcode record. Typically, the list size is one. However, there
|
| +/// are cases where a record defines more than one value (i.e. value
|
| +/// indices). Hence, this type defines the more generic API for
|
| +/// values.
|
| +typedef std::vector<NaClBitcodeDistValue> ValueListType;
|
| +
|
| +typedef ValueListType::const_iterator ValueListIterator;
|
| +
|
| +/// Defines a PNaCl bitcode record distribution map. The distribution
|
| +/// map is a map from a (record) value, to the corresponding data
|
| +/// associated with that value. Assumes distributions elements are
|
| +/// instances of NaClBitcodeDistElement.
|
| +class NaClBitcodeDist {
|
| + NaClBitcodeDist(const NaClBitcodeDist&) LLVM_DELETED_FUNCTION;
|
| + void operator=(const NaClBitcodeDist&) LLVM_DELETED_FUNCTION;
|
| + friend class NaClBitcodeDistElement;
|
| +
|
| +public:
|
| + /// Define kinds for isa, dyn_cast, etc. support (see
|
| + /// llvm/Support/Casting.h). Only defined for concrete classes.
|
| + enum NaClBitcodeDistKind {
|
| + RD_Dist, // class NaClBitcodeDist.
|
| + RD_BlockDist, // class NaClBitcodeBlockDist.
|
| + RD_BlockDistLast,
|
| + RD_CodeDist, // class NaClBitcodeCodeDist.
|
| + RD_CodeDistLast,
|
| + RD_AbbrevDist, // class NaClBitcodeAbbrevDist.
|
| + RD_AbbrevDistLast,
|
| + RD_SubblockDist, // class NaClBlockSubblockDist.
|
| + RD_SubblockDistLast,
|
| + RD_ValueDist, // class NaClBitcodeValueDist.
|
| + RD_ValueDistLast,
|
| + RD_DistLast
|
| + };
|
| +
|
| + NaClBitcodeDistKind getKind() const { return Kind; }
|
| +
|
| +private:
|
| + const NaClBitcodeDistKind Kind;
|
| +
|
| + static bool classof(const NaClBitcodeDist *Dist) {
|
| + return Dist->getKind() >= RD_Dist && Dist->getKind() < RD_DistLast;
|
| + }
|
| +
|
| +public:
|
| + /// Type defining the mapping used to define the distribution.
|
| + typedef std::map<NaClBitcodeDistValue, NaClBitcodeDistElement*> MappedElement;
|
| +
|
| + typedef MappedElement::const_iterator const_iterator;
|
| +
|
| + /// Type defining a pair of values used to sort the
|
| + /// distribution. The first element is defined by method
|
| + /// GetImportance, and the second is the distribution value
|
| + /// associated with that importance.
|
| + typedef std::pair<double, NaClBitcodeDistValue> DistPair;
|
| +
|
| + /// Type defining the sorted list of (domain) values in the
|
| + /// corresponding distribution map.
|
| + typedef std::vector<DistPair> Distribution;
|
| +
|
| + /// Defines whether blocks or records are stored in the distribution map.
|
| + /// Used to decide if AddRecord/AddBlock methods should fire.
|
| + enum StorageSelector {
|
| + BlockStorage,
|
| + RecordStorage
|
| + };
|
| +
|
| + NaClBitcodeDist(StorageSelector StorageKind,
|
| + const NaClBitcodeDistElement *Sentinel,
|
| + NaClBitcodeDistKind Kind=RD_Dist)
|
| + : Kind(Kind), StorageKind(StorageKind), Sentinel(Sentinel),
|
| + TableMap(), CachedDistribution(0), Total(0) {
|
| + }
|
| +
|
| + virtual ~NaClBitcodeDist();
|
| +
|
| + /// Number of elements in the distribution map.
|
| + size_t size() const {
|
| + return TableMap.size();
|
| + }
|
| +
|
| + /// Iterator at beginning of distribution map.
|
| + const_iterator begin() const {
|
| + return TableMap.begin();
|
| + }
|
| +
|
| + /// Iterator at end of distribution map.
|
| + const_iterator end() const {
|
| + return TableMap.end();
|
| + }
|
| +
|
| + /// Returns true if the distribution map is empty.
|
| + bool empty() const {
|
| + return TableMap.empty();
|
| + }
|
| +
|
| + /// Returns the element associated with the given distribution
|
| + /// value. Creates the element if needed.
|
| + inline NaClBitcodeDistElement *GetElement(NaClBitcodeDistValue Value);
|
| +
|
| + /// Returns the element associated with the given distribution
|
| + /// value.
|
| + NaClBitcodeDistElement *at(NaClBitcodeDistValue Value) const {
|
| + return TableMap.at(Value);
|
| + }
|
| +
|
| + // Creates a new instance of this element for the given value. Used
|
| + // by class NaClBitcodeDist to create instances. Default method
|
| + // simply dispatches to the CreateElement method of the sentinel.
|
| + virtual NaClBitcodeDistElement *CreateElement(
|
| + NaClBitcodeDistValue Value) const;
|
| +
|
| + /// Interrogates the block record, and returns the corresponding
|
| + /// values that are being tracked by the distribution map. Default
|
| + /// method simply dispatches to the GetValueList of the sentinel.
|
| + virtual void GetValueList(const NaClBitcodeRecord &Record,
|
| + ValueListType &ValueList) const;
|
| +
|
| + /// Returns the total number of instances held in the distribution
|
| + /// map.
|
| + unsigned GetTotal() const {
|
| + return Total;
|
| + }
|
| +
|
| + /// Adds the value(s) in the given bitcode record to the
|
| + /// distribution map. The value(s) based on method GetValueList.
|
| + /// Note: Default requires that GetStorageKind() == RecordStorage.
|
| + /// Override if you want special handling for nested distributions
|
| + /// in a block distribution map.
|
| + virtual void AddRecord(const NaClBitcodeRecord &Record);
|
| +
|
| + /// Adds the BlockID of the given bitcode block to the distribution
|
| + /// map, if applicable (based on the value of method UseBlockID).
|
| + /// Note: Requires that GetStorageKind() == BlockStorage.
|
| + virtual void AddBlock(const NaClBitcodeBlock &Block);
|
| +
|
| + /// Builds the distribution associated with the distribution map.
|
| + /// Warning: The distribution is cached, and hence, only valid while
|
| + /// it's contents is not changed.
|
| + const Distribution *GetDistribution() const {
|
| + if (CachedDistribution == 0) Sort();
|
| + return CachedDistribution;
|
| + }
|
| +
|
| + /// Prints out the contents of the distribution map to Stream.
|
| + void Print(raw_ostream &Stream, const std::string &Indent) const;
|
| +
|
| + void Print(raw_ostream &Stream) const {
|
| + std::string Indent;
|
| + Print(Stream, Indent);
|
| + }
|
| +
|
| +protected:
|
| + /// If the distribution is cached, remove it. Should be called
|
| + /// whenever the distribution map is changed.
|
| + void RemoveCachedDistribution() const {
|
| + if (CachedDistribution) {
|
| + delete CachedDistribution;
|
| + CachedDistribution = 0;
|
| + }
|
| + }
|
| +
|
| + /// Sorts the distribution, based on the importance of each element.
|
| + void Sort() const;
|
| +
|
| +private:
|
| + // Defines whether values in distribution map are from blocks or records.
|
| + const StorageSelector StorageKind;
|
| + // Sentinel element used to do generic operations for distribution.
|
| + const NaClBitcodeDistElement *Sentinel;
|
| + // Map from the distribution value to the corresponding distribution
|
| + // element.
|
| + MappedElement TableMap;
|
| + // Pointer to the cached distribution.
|
| + mutable Distribution *CachedDistribution;
|
| + // The total number of instances in the map.
|
| + unsigned Total;
|
| +};
|
| +
|
| +/// Defines the element type of a PNaCl bitcode distribution map.
|
| +/// This is the base class for all element types used in
|
| +/// NaClBitcodeDist. By default, only the number of instances
|
| +/// of the corresponding distribution values is recorded.
|
| +class NaClBitcodeDistElement {
|
| + NaClBitcodeDistElement(const NaClBitcodeDistElement &)
|
| + LLVM_DELETED_FUNCTION;
|
| + void operator=(const NaClBitcodeDistElement &)
|
| + LLVM_DELETED_FUNCTION;
|
| +
|
| +public:
|
| + /// Define kinds for isa, dyn_cast, etc. support. Only defined
|
| + /// for concrete classes.
|
| + enum NaClBitcodeDistElementKind {
|
| + RDE_Dist, // class NaClBitcodeDistElement.
|
| + RDE_AbbrevDist, // class NaClBitcodeAbbrevDistElement.
|
| + RDE_AbbrevDistLast,
|
| + RDE_BitsDist, // class NaClBitcodeBitsDistElement.
|
| + RDE_BitsAndAbbrevsDist, // class NaClBitcodeBitsAndAbbrevsDistElement.
|
| + RDE_CodeDist, // class NaClBitcodeCodeDistElement.
|
| + RDE_CompressCodeDist, // class NaClCompressCodeDistElement.
|
| + RDE_CompressCodeDistLast,
|
| + RDE_CodeDistLast,
|
| + RDE_BitsAndAbbrevsDistLast,
|
| + RDE_BitsDistLast,
|
| + RDE_BlockDist, // class NaClBitcodeBlockDistElement.
|
| + RDE_NaClAnalBlockDist, // class NaClAnalyzerBlockDistElement.
|
| + RDE_NaClAnalBlockDistLast,
|
| + RDE_PNaClCompressBlockDist, // class NaClCompressBlockDistElement.
|
| + RDE_PNaClCompressBlockDistLast,
|
| + RDE_BlockDistLast,
|
| + RDE_SizeDist, // class NaClBitcodeSizeDistElement.
|
| + RDE_SizeDistLast,
|
| + RDE_SubblockDist, // class NaClBitcodeSubblockDistElement
|
| + RDE_SubblockDistLast,
|
| + RDE_ValueDist, // class NaClBitcodeValueDistElement.
|
| + RDE_ValueDistLast,
|
| + RDE_ValueIndexDist, // class NaClBitcodeValueIndexDistElement.
|
| + RDE_ValueIndexDistLast,
|
| + RDE_DistLast
|
| + };
|
| +
|
| + NaClBitcodeDistElementKind getKind() const { return Kind; }
|
| +
|
| + static bool classof(const NaClBitcodeDistElement *Element) {
|
| + return Element->getKind() >= RDE_Dist && Element->getKind() < RDE_DistLast;
|
| + }
|
| +
|
| +private:
|
| + const NaClBitcodeDistElementKind Kind;
|
| +
|
| +public:
|
| +
|
| + // Constructor to use in derived classes.
|
| + NaClBitcodeDistElement(
|
| + NaClBitcodeDistElementKind Kind=RDE_Dist)
|
| + : Kind(Kind), NumInstances(0)
|
| + {}
|
| +
|
| + virtual ~NaClBitcodeDistElement();
|
| +
|
| + // Adds an instance of the given record to this element.
|
| + virtual void AddRecord(const NaClBitcodeRecord &Record);
|
| +
|
| + // Adds an instance of the given block to this element.
|
| + virtual void AddBlock(const NaClBitcodeBlock &Block);
|
| +
|
| + // Returns the number of instances associated with this element.
|
| + unsigned GetNumInstances() const {
|
| + return NumInstances;
|
| + }
|
| +
|
| + // Creates a new instance of this element for the given value. Used
|
| + // by class NaClBitcodeDist to create instances.
|
| + virtual NaClBitcodeDistElement *CreateElement(
|
| + NaClBitcodeDistValue Value) const = 0;
|
| +
|
| + /// Interrogates the block record, and returns the corresponding
|
| + /// values that are being tracked by the distribution map. Must be
|
| + /// defined in derived classes.
|
| + virtual void GetValueList(const NaClBitcodeRecord &Record,
|
| + ValueListType &ValueList) const;
|
| +
|
| + // Returns the importance of this element. In many cases, it will be
|
| + // the number of instances associated with it. However, it need not
|
| + // be correlated to the number of instance. Used to sort the
|
| + // distribution map, where values with larger importance appear
|
| + // first. Value is the domain value associated with the element in
|
| + // the distribution map.
|
| + virtual double GetImportance(NaClBitcodeDistValue Value) const;
|
| +
|
| + /// Returns the title to use when printing the title associated
|
| + /// with instances of this distribution element.
|
| + virtual const char *GetTitle() const;
|
| +
|
| + /// Prints out the title of the distribution map associated with
|
| + /// instances of this distribution element.
|
| + virtual void PrintTitle(raw_ostream &Stream,
|
| + const NaClBitcodeDist *Distribution) const;
|
| +
|
| + /// Returns the header to use when printing the value associated
|
| + /// with instances of this distribution element.
|
| + virtual const char *GetValueHeader() const;
|
| +
|
| + /// Prints out header for row of statistics associated with instances
|
| + /// of this distribution element.
|
| + virtual void PrintStatsHeader(raw_ostream &Stream) const;
|
| +
|
| + /// Prints out the header to the printed distribution map associated
|
| + /// with instances of this distribution element.
|
| + void PrintHeader(raw_ostream &Stream) const;
|
| +
|
| + /// Prints out statistics for the row with the given value.
|
| + virtual void PrintRowStats(raw_ostream &Stream,
|
| + const NaClBitcodeDist *Distribution) const;
|
| +
|
| + /// Prints out Value (in a row) to Stream.
|
| + virtual void PrintRowValue(raw_ostream &Stream,
|
| + NaClBitcodeDistValue Value,
|
| + const NaClBitcodeDist *Distribution) const;
|
| +
|
| + /// Prints out a row in the printed distribution map.
|
| + virtual void PrintRow(raw_ostream &Stream,
|
| + NaClBitcodeDistValue Value,
|
| + const NaClBitcodeDist *Distribution) const;
|
| +
|
| + /// Returns a pointer to the list of nested distributions that
|
| + /// should be printed when this element is printed. Return 0 if no
|
| + /// nested distributions should be printed.
|
| + virtual const SmallVectorImpl<NaClBitcodeDist*> *
|
| + GetNestedDistributions() const;
|
| +
|
| + /// Prints out any nested distributions, if defined for the element.
|
| + /// Returns true if a nested distribution was printed.
|
| + bool PrintNestedDistIfApplicable(
|
| + raw_ostream &Stream, const std::string &Indent) const;
|
| +
|
| +private:
|
| + // The number of instances associated with this element.
|
| + unsigned NumInstances;
|
| +};
|
| +
|
| +inline NaClBitcodeDistElement *NaClBitcodeDist::
|
| +GetElement(NaClBitcodeDistValue Value) {
|
| + if (TableMap.find(Value) == TableMap.end()) {
|
| + TableMap[Value] = CreateElement(Value);
|
| + }
|
| + return TableMap[Value];
|
| +}
|
| +
|
| +}
|
| +
|
| +#endif
|
|
|