gcc/libstdc++-v3/doc/xml/manual/bitmap_allocator.xml - Issue 3050029: [gcc] GCC 4.5.0=>4.5.1

Unified Diff: gcc/libstdc++-v3/doc/xml/manual/bitmap_allocator.xml

Issue 3050029: [gcc] GCC 4.5.0=>4.5.1 (Closed) Base URL: ssh://git@gitrw.chromium.org:9222/nacl-toolchain.git

Patch Set: Created 10 years, 5 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View side-by-side diff with in-line comments

« no previous file with comments | « gcc/libstdc++-v3/doc/xml/manual/backwards_compatibility.xml ('k') | gcc/libstdc++-v3/doc/xml/manual/concurrency.xml » ('j') | no next file with comments »
Expand Comments ('e') | Collapse Comments ('c') | Hide Comments ('s')

Index: gcc/libstdc++-v3/doc/xml/manual/bitmap_allocator.xml

diff --git a/gcc/libstdc++-v3/doc/xml/manual/bitmap_allocator.xml b/gcc/libstdc++-v3/doc/xml/manual/bitmap_allocator.xml

deleted file mode 100644

index 1815a39c4c0257303df1e4b28392c24f261f5b52..0000000000000000000000000000000000000000

--- a/gcc/libstdc++-v3/doc/xml/manual/bitmap_allocator.xml

+++ /dev/null

@@ -1,559 +0,0 @@

-<sect1 id="manual.ext.allocator.bitmap" xreflabel="mt allocator">

-<?dbhtml filename="bitmap_allocator.html"?>

-<sect1info>

- <keywordset>

- <keyword>

- ISO C++

- </keyword>

- <keyword>

- allocator

- </keyword>

- </keywordset>

-</sect1info>

-<title>bitmap_allocator</title>

-<para>

-</para>

-<sect2 id="allocator.bitmap.design" xreflabel="allocator.bitmap.design">

-<title>Design</title>

- <para>

- As this name suggests, this allocator uses a bit-map to keep track

- of the used and unused memory locations for it's book-keeping

- purposes.

- </para>

- <para>

- This allocator will make use of 1 single bit to keep track of

- whether it has been allocated or not. A bit 1 indicates free,

- while 0 indicates allocated. This has been done so that you can

- easily check a collection of bits for a free block. This kind of

- Bitmapped strategy works best for single object allocations, and

- with the STL type parameterized allocators, we do not need to

- choose any size for the block which will be represented by a

- single bit. This will be the size of the parameter around which

- the allocator has been parameterized. Thus, close to optimal

- performance will result. Hence, this should be used for node based

- containers which call the allocate function with an argument of 1.

- </para>

- <para>

- The bitmapped allocator's internal pool is exponentially growing.

- Meaning that internally, the blocks acquired from the Free List

- Store will double every time the bitmapped allocator runs out of

- memory.

- </para>

- <para>

- The macro <literal>__GTHREADS</literal> decides whether to use

- Mutex Protection around every allocation/deallocation. The state

- of the macro is picked up automatically from the gthr abstraction

- layer.

- </para>

-</sect2>

-<sect2 id="allocator.bitmap.impl" xreflabel="allocator.bitmap.impl">

-<title>Implementation</title>

-<sect3 id="bitmap.impl.free_list_store" xreflabel="Free List Store">

- <title>Free List Store</title>

- <para>

- The Free List Store (referred to as FLS for the remaining part of this

- document) is the Global memory pool that is shared by all instances of

- the bitmapped allocator instantiated for any type. This maintains a

- sorted order of all free memory blocks given back to it by the

- bitmapped allocator, and is also responsible for giving memory to the

- bitmapped allocator when it asks for more.

- </para>

- <para>

- Internally, there is a Free List threshold which indicates the

- Maximum number of free lists that the FLS can hold internally

- (cache). Currently, this value is set at 64. So, if there are

- more than 64 free lists coming in, then some of them will be given

- back to the OS using operator delete so that at any given time the

- Free List's size does not exceed 64 entries. This is done because

- a Binary Search is used to locate an entry in a free list when a

- request for memory comes along. Thus, the run-time complexity of

- the search would go up given an increasing size, for 64 entries

- however, lg(64) == 6 comparisons are enough to locate the correct

- free list if it exists.

- </para>

- <para>

- Suppose the free list size has reached it's threshold, then the

- largest block from among those in the list and the new block will

- be selected and given back to the OS. This is done because it

- reduces external fragmentation, and allows the OS to use the

- larger blocks later in an orderly fashion, possibly merging them

- later. Also, on some systems, large blocks are obtained via calls

- to mmap, so giving them back to free system resources becomes most

- important.

- </para>

- <para>

- The function _S_should_i_give decides the policy that determines

- whether the current block of memory should be given to the

- allocator for the request that it has made. That's because we may

- not always have exact fits for the memory size that the allocator

- requests. We do this mainly to prevent external fragmentation at

- the cost of a little internal fragmentation. Now, the value of

- this internal fragmentation has to be decided by this function. I

- can see 3 possibilities right now. Please add more as and when you

- find better strategies.

- </para>

-<orderedlist>

- <listitem><para>Equal size check. Return true only when the 2 blocks are of equal

-size.</para></listitem>

- <listitem><para>Difference Threshold: Return true only when the _block_size is

-greater than or equal to the _required_size, and if the _BS is > _RS

-by a difference of less than some THRESHOLD value, then return true,

-else return false. </para></listitem>

- <listitem><para>Percentage Threshold. Return true only when the _block_size is

-greater than or equal to the _required_size, and if the _BS is > _RS

-by a percentage of less than some THRESHOLD value, then return true,

-else return false.</para></listitem>

-</orderedlist>

- <para>

- Currently, (3) is being used with a value of 36% Maximum wastage per

- Super Block.

- </para>

-</sect3>

-<sect3 id="bitmap.impl.super_block" xreflabel="Super Block">

- <title>Super Block</title>

- <para>

- A super block is the block of memory acquired from the FLS from

- which the bitmap allocator carves out memory for single objects

- and satisfies the user's requests. These super blocks come in

- sizes that are powers of 2 and multiples of 32

- (_Bits_Per_Block). Yes both at the same time! That's because the

- next super block acquired will be 2 times the previous one, and

- also all super blocks have to be multiples of the _Bits_Per_Block

- value.

- </para>

- <para>

- How does it interact with the free list store?

- </para>

- <para>

- The super block is contained in the FLS, and the FLS is responsible for

- getting / returning Super Bocks to and from the OS using operator new

- as defined by the C++ standard.

- </para>

-</sect3>

-<sect3 id="bitmap.impl.super_block_data" xreflabel="Super Block Data">

- <title>Super Block Data Layout</title>

- <para>

- Each Super Block will be of some size that is a multiple of the

- number of Bits Per Block. Typically, this value is chosen as

- Bits_Per_Byte x sizeof(size_t). On an x86 system, this gives the

- figure 8 x 4 = 32. Thus, each Super Block will be of size 32

- x Some_Value. This Some_Value is sizeof(value_type). For now, let

- it be called 'K'. Thus, finally, Super Block size is 32 x K bytes.

- </para>

- <para>

- This value of 32 has been chosen because each size_t has 32-bits

- and Maximum use of these can be made with such a figure.

- </para>

- <para>

- Consider a block of size 64 ints. In memory, it would look like this:

- (assume a 32-bit system where, size_t is a 32-bit entity).

- </para>

-<table frame='all'>

-<title>Bitmap Allocator Memory Map</title>

-<tgroup cols='5' align='left' colsep='1' rowsep='1'>

-<colspec colname='c1'></colspec>

-<colspec colname='c2'></colspec>

-<colspec colname='c3'></colspec>

-<colspec colname='c4'></colspec>

-<colspec colname='c5'></colspec>

-<tbody>

- <row>

- <entry>268</entry>

- <entry>0</entry>

- <entry>4294967295</entry>

- <entry>Data -> Space for 64 ints</entry>

- </row>

-</tbody>

-</tgroup>

-</table>

- <para>

- The first Column(268) represents the size of the Block in bytes as

- seen by the Bitmap Allocator. Internally, a global free list is

- used to keep track of the free blocks used and given back by the

- bitmap allocator. It is this Free List Store that is responsible

- for writing and managing this information. Actually the number of

- bytes allocated in this case would be: 4 + 4 + (4x2) + (64x4) =

- 272 bytes, but the first 4 bytes are an addition by the Free List

- Store, so the Bitmap Allocator sees only 268 bytes. These first 4

- bytes about which the bitmapped allocator is not aware hold the

- value 268.

- </para>

- <para>

- What do the remaining values represent?</para>

- <para>

- The 2nd 4 in the expression is the sizeof(size_t) because the

- Bitmapped Allocator maintains a used count for each Super Block,

- which is initially set to 0 (as indicated in the diagram). This is

- incremented every time a block is removed from this super block

- (allocated), and decremented whenever it is given back. So, when

- the used count falls to 0, the whole super block will be given

- back to the Free List Store.

- </para>

- <para>

- The value 4294967295 represents the integer corresponding to the bit

- representation of all bits set: 11111111111111111111111111111111.

- </para>

- <para>

- The 3rd 4x2 is size of the bitmap itself, which is the size of 32-bits

- x 2,

- which is 8-bytes, or 2 x sizeof(size_t).

- </para>

-</sect3>

-<sect3 id="bitmap.impl.max_wasted" xreflabel="Max Wasted Percentage">

- <title>Maximum Wasted Percentage</title>

- <para>

- This has nothing to do with the algorithm per-se,

- only with some vales that must be chosen correctly to ensure that the

- allocator performs well in a real word scenario, and maintains a good

- balance between the memory consumption and the allocation/deallocation

- speed.

- </para>

- <para>

- The formula for calculating the maximum wastage as a percentage:

- </para>

- <para>

-(32 x k + 1) / (2 x (32 x k + 1 + 32 x c)) x 100.

- </para>

- <para>

- where k is the constant overhead per node (e.g., for list, it is

- 8 bytes, and for map it is 12 bytes) and c is the size of the

- base type on which the map/list is instantiated. Thus, suppose the

- type1 is int and type2 is double, they are related by the relation

- sizeof(double) == 2*sizeof(int). Thus, all types must have this

- double size relation for this formula to work properly.

- </para>

- <para>

- Plugging-in: For List: k = 8 and c = 4 (int and double), we get:

- 33.376%

- </para>

- <para>

-For map/multimap: k = 12, and c = 4 (int and double), we get: 37.524%

- </para>

- <para>

- Thus, knowing these values, and based on the sizeof(value_type), we may

- create a function that returns the Max_Wastage_Percentage for us to use.

- </para>

-</sect3>

-<sect3 id="bitmap.impl.allocate" xreflabel="Allocate">

- <title><function>allocate</function></title>

- <para>

- The allocate function is specialized for single object allocation

- ONLY. Thus, ONLY if n == 1, will the bitmap_allocator's

- specialized algorithm be used. Otherwise, the request is satisfied

- directly by calling operator new.

- </para>

- <para>

- Suppose n == 1, then the allocator does the following:

- </para>

- <orderedlist>

- <listitem>

- <para>

- Checks to see whether a free block exists somewhere in a region

- of memory close to the last satisfied request. If so, then that

- block is marked as allocated in the bit map and given to the

- user. If not, then (2) is executed.

- </para>

- </listitem>

- <listitem>

- <para>

- Is there a free block anywhere after the current block right

- up to the end of the memory that we have? If so, that block is

- found, and the same procedure is applied as above, and

- returned to the user. If not, then (3) is executed.

- </para>

- </listitem>

- <listitem>

- <para>

- Is there any block in whatever region of memory that we own

- free? This is done by checking

- </para>

- <itemizedlist>

- <listitem>

- <para>

- The use count for each super block, and if that fails then

- </para>

- </listitem>

- <listitem>

- <para>

- The individual bit-maps for each super block.

- </para>

- </listitem>

- </itemizedlist>

- <para>

- Note: Here we are never touching any of the memory that the

- user will be given, and we are confining all memory accesses

- to a small region of memory! This helps reduce cache

- misses. If this succeeds then we apply the same procedure on

- that bit-map as (1), and return that block of memory to the

- user. However, if this process fails, then we resort to (4).

- </para>

- </listitem>

- <listitem>

- <para>

- This process involves Refilling the internal exponentially

- growing memory pool. The said effect is achieved by calling

- _S_refill_pool which does the following:

- </para>

- <itemizedlist>

- <listitem>

- <para>

- Gets more memory from the Global Free List of the Required

- size.

- </para>

- </listitem>

- <listitem>

- <para>

- Adjusts the size for the next call to itself.

- </para>

- </listitem>

- <listitem>

- <para>

- Writes the appropriate headers in the bit-maps.

- </para>

- </listitem>

- <listitem>

- <para>

- Sets the use count for that super-block just allocated to 0

- (zero).

- </para>

- </listitem>

- <listitem>

- <para>

- All of the above accounts to maintaining the basic invariant

- for the allocator. If the invariant is maintained, we are

- sure that all is well. Now, the same process is applied on

- the newly acquired free blocks, which are dispatched

- accordingly.

- </para>

- </listitem>

- </itemizedlist>

- </listitem>

-</orderedlist>

-<para>

-Thus, you can clearly see that the allocate function is nothing but a

-combination of the next-fit and first-fit algorithm optimized ONLY for

-single object allocations.

-</para>

-</sect3>

-<sect3 id="bitmap.impl.deallocate" xreflabel="Deallocate">

- <title><function>deallocate</function></title>

- <para>

- The deallocate function again is specialized for single objects ONLY.

- For all n belonging to > 1, the operator delete is called without

- further ado, and the deallocate function returns.

- </para>

- <para>

- However for n == 1, a series of steps are performed:

- </para>

- <orderedlist>

- <listitem><para>

- We first need to locate that super-block which holds the memory

- location given to us by the user. For that purpose, we maintain

- a static variable _S_last_dealloc_index, which holds the index

- into the vector of block pairs which indicates the index of the

- last super-block from which memory was freed. We use this

- strategy in the hope that the user will deallocate memory in a

- region close to what he/she deallocated the last time around. If

- the check for belongs_to succeeds, then we determine the bit-map

- for the given pointer, and locate the index into that bit-map,

- and mark that bit as free by setting it.

- </para></listitem>

- <listitem><para>

- If the _S_last_dealloc_index does not point to the memory block

- that we're looking for, then we do a linear search on the block

- stored in the vector of Block Pairs. This vector in code is

- called _S_mem_blocks. When the corresponding super-block is

- found, we apply the same procedure as we did for (1) to mark the

- block as free in the bit-map.

- </para></listitem>

- </orderedlist>

- <para>

- Now, whenever a block is freed, the use count of that particular

- super block goes down by 1. When this use count hits 0, we remove

- that super block from the list of all valid super blocks stored in

- the vector. While doing this, we also make sure that the basic

- invariant is maintained by making sure that _S_last_request and

- _S_last_dealloc_index point to valid locations within the vector.

- </para>

-</sect3>

-<sect3 id="bitmap.impl.questions" xreflabel="Questions">

- <title>Questions</title>

- <sect4 id="bitmap.impl.question.1" xreflabel="Question 1">

- <title>1</title>

- <para>

-Q1) The "Data Layout" section is

-cryptic. I have no idea of what you are trying to say. Layout of what?

-The free-list? Each bitmap? The Super Block?

- </para>

- <para>

- The layout of a Super Block of a given

-size. In the example, a super block of size 32 x 1 is taken. The

-general formula for calculating the size of a super block is

-32 x sizeof(value_type) x 2^n, where n ranges from 0 to 32 for 32-bit

-systems.

- </para>

- </sect4>

- <sect4 id="bitmap.impl.question.2" xreflabel="Question 2">

- <title>2</title>

- <para>

- And since I just mentioned the

-term `each bitmap', what in the world is meant by it? What does each

-bitmap manage? How does it relate to the super block? Is the Super

-Block a bitmap as well?

- </para>

- <para>

- Each bitmap is part of a Super Block which is made up of 3 parts

- as I have mentioned earlier. Re-iterating, 1. The use count,

- 2. The bit-map for that Super Block. 3. The actual memory that

- will be eventually given to the user. Each bitmap is a multiple

- of 32 in size. If there are 32 x (2^3) blocks of single objects

- to be given, there will be '32 x (2^3)' bits present. Each 32

- bits managing the allocated / free status for 32 blocks. Since

- each size_t contains 32-bits, one size_t can manage up to 32

- blocks' status. Each bit-map is made up of a number of size_t,

- whose exact number for a super-block of a given size I have just

- mentioned.

- </para>

- </sect4>

- <sect4 id="bitmap.impl.question.3" xreflabel="Question 3">

- <title>3</title>

- <para>

- How do the allocate and deallocate functions work in regard to

- bitmaps?

- </para>

- <para>

- The allocate and deallocate functions manipulate the bitmaps and

- have nothing to do with the memory that is given to the user. As

- I have earlier mentioned, a 1 in the bitmap's bit field

- indicates free, while a 0 indicates allocated. This lets us

- check 32 bits at a time to check whether there is at lease one

- free block in those 32 blocks by testing for equality with

- (0). Now, the allocate function will given a memory block find

- the corresponding bit in the bitmap, and will reset it (i.e.,

- make it re-set (0)). And when the deallocate function is called,

- it will again set that bit after locating it to indicate that

- that particular block corresponding to this bit in the bit-map

- is not being used by anyone, and may be used to satisfy future

- requests.

- </para>

- <para>

- e.g.: Consider a bit-map of 64-bits as represented below:

- 1111111111111111111111111111111111111111111111111111111111111111

- </para>

- <para>

- Now, when the first request for allocation of a single object

- comes along, the first block in address order is returned. And

- since the bit-maps in the reverse order to that of the address

- order, the last bit (LSB if the bit-map is considered as a

- binary word of 64-bits) is re-set to 0.

- </para>

- <para>

- The bit-map now looks like this:

- 1111111111111111111111111111111111111111111111111111111111111110

- </para>

- </sect4>

-</sect3>

-<sect3 id="bitmap.impl.locality" xreflabel="Locality">

- <title>Locality</title>

- <para>

- Another issue would be whether to keep the all bitmaps in a

- separate area in memory, or to keep them near the actual blocks

- that will be given out or allocated for the client. After some

- testing, I've decided to keep these bitmaps close to the actual

- blocks. This will help in 2 ways.

- </para>

- <orderedlist>

- <listitem><para>Constant time access for the bitmap themselves, since no kind of

-look up will be needed to find the correct bitmap list or it's

-equivalent.</para></listitem>

- <listitem><para>And also this would preserve the cache as far as possible.</para></listitem>

- </orderedlist>

- <para>

- So in effect, this kind of an allocator might prove beneficial from a

- purely cache point of view. But this allocator has been made to try and

- roll out the defects of the node_allocator, wherein the nodes get

- skewed about in memory, if they are not returned in the exact reverse

- order or in the same order in which they were allocated. Also, the

- new_allocator's book keeping overhead is too much for small objects and

- single object allocations, though it preserves the locality of blocks

- very well when they are returned back to the allocator.

- </para>

-</sect3>

-<sect3 id="bitmap.impl.grow_policy" xreflabel="Grow Policy">

- <title>Overhead and Grow Policy</title>

- <para>

- Expected overhead per block would be 1 bit in memory. Also, once

- the address of the free list has been found, the cost for

- allocation/deallocation would be negligible, and is supposed to be

- constant time. For these very reasons, it is very important to

- minimize the linear time costs, which include finding a free list

- with a free block while allocating, and finding the corresponding

- free list for a block while deallocating. Therefore, I have

- decided that the growth of the internal pool for this allocator

- will be exponential as compared to linear for

- node_allocator. There, linear time works well, because we are

- mainly concerned with speed of allocation/deallocation and memory

- consumption, whereas here, the allocation/deallocation part does

- have some linear/logarithmic complexity components in it. Thus, to

- try and minimize them would be a good thing to do at the cost of a

- little bit of memory.

- </para>

- <para>

- Another thing to be noted is the pool size will double every time

- the internal pool gets exhausted, and all the free blocks have

- been given away. The initial size of the pool would be

- sizeof(size_t) x 8 which is the number of bits in an integer,

- which can fit exactly in a CPU register. Hence, the term given is

- exponential growth of the internal pool.

- </para>

-</sect3>

-</sect2>

-</sect1>

« no previous file with comments | « gcc/libstdc++-v3/doc/xml/manual/backwards_compatibility.xml ('k') | gcc/libstdc++-v3/doc/xml/manual/concurrency.xml » ('j') | no next file with comments »