gcc/libstdc++-v3/doc/html/ext/pb_ds/hash_based_containers.html - Issue 3050029: [gcc] GCC 4.5.0=>4.5.1

Unified Diff: gcc/libstdc++-v3/doc/html/ext/pb_ds/hash_based_containers.html

Issue 3050029: [gcc] GCC 4.5.0=>4.5.1 (Closed) Base URL: ssh://git@gitrw.chromium.org:9222/nacl-toolchain.git

Patch Set: Created 10 years, 5 months ago


Index: gcc/libstdc++-v3/doc/html/ext/pb_ds/hash_based_containers.html

diff --git a/gcc/libstdc++-v3/doc/html/ext/pb_ds/hash_based_containers.html b/gcc/libstdc++-v3/doc/html/ext/pb_ds/hash_based_containers.html

deleted file mode 100644

index 21d092a76ef19933d7716a81ecded8dbc1eef55d..0000000000000000000000000000000000000000

--- a/gcc/libstdc++-v3/doc/html/ext/pb_ds/hash_based_containers.html

+++ /dev/null

@@ -1,835 +0,0 @@

-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

-<head>

- <meta name="generator" content=

- "HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />

- <title>Hash-Based Containers</title>

- <meta http-equiv="Content-Type" content=

- "text/html; charset=us-ascii" />

- </head>

-<body>

- <div id="page">

- <h1>Hash Table Design</h1>

- <h2><a name="overview" id="overview">Overview</a></h2>

- The collision-chaining hash-based container has the

- following declaration.

- <pre>

-template<

- typename Key,

- typename Mapped,

- typename Hash_Fn = std::hash<Key>,

- typename Eq_Fn = std::equal_to<Key>,

- typename Comb_Hash_Fn = <a href=

-"direct_mask_range_hashing.html">direct_mask_range_hashing</a><>,

- typename Resize_Policy = /* default explained below */,

- bool Store_Hash = false,

- typename Allocator = std::allocator<char> >

-class <a href=

-"cc_hash_table.html">cc_hash_table</a>;

-</pre>

- The parameters have the following meaning:

- <ol>

- <li><tt>Key</tt> is the key type.</li>

- <li><tt>Mapped</tt> is the mapped-policy, and is explained in

- <a href="tutorial.html#assoc_ms">Tutorial::Associative

- Containers::Associative Containers Others than Maps</a>.</li>

- <li><tt>Hash_Fn</tt> is a key hashing functor.</li>

- <li><tt>Eq_Fn</tt> is a key equivalence functor.</li>

- <li><tt>Comb_Hash_Fn</tt> is a range-hashing functor;

- it describes how to translate hash values into positions

- within the table. This is described in <a href=

- "#hash_policies">Hash Policies</a>.</li>

- <li><tt>Resize_Policy</tt> describes how a container object

- should change its internal size. This is described in

- <a href="#resize_policies">Resize Policies</a>.</li>

- <li><tt>Store_Hash</tt> indicates whether the hash value

- should be stored with each entry. This is described in

- <a href="#policy_interaction">Policy Interaction</a>.</li>

- <li><tt>Allocator</tt> is an allocator

- type.</li>

- </ol>

- The probing hash-based container has the following

- declaration.

- <pre>

-template<

- typename Key,

- typename Mapped,

- typename Hash_Fn = std::hash<Key>,

- typename Eq_Fn = std::equal_to<Key>,

- typename Comb_Probe_Fn = <a href=

-"direct_mask_range_hashing.html">direct_mask_range_hashing</a><>,

- typename Probe_Fn = /* default explained below */,

- typename Resize_Policy = /* default explained below */,

- bool Store_Hash = false,

- typename Allocator = std::allocator<char> >

-class <a href=

-"gp_hash_table.html">gp_hash_table</a>;

-</pre>

- The parameters are identical to those of the

- collision-chaining container, except for the following.

- <ol>

- <li><tt>Comb_Probe_Fn</tt> describes how to transform a probe

- sequence into a sequence of positions within the table.</li>

- <li><tt>Probe_Fn</tt> describes a probe sequence policy.</li>

- </ol>

- Some of the default template values depend on the values of

- other parameters, and are explained in <a href=

- "#policy_interaction">Policy Interaction</a>.

- <h2><a name="hash_policies" id="hash_policies">Hash

- Policies</a></h2>

- <h3><a name="general_terms" id="general_terms">General

- Terms</a></h3>

- Following is an explanation of some functions which hashing

- involves. Figure <a href=

- "#hash_ranged_hash_range_hashing_fns">Hash functions,

- ranged-hash functions, and range-hashing functions</a>

- illustrates the discussion.

- <h6 class="c1"><a name="hash_ranged_hash_range_hashing_fns" id=

- "hash_ranged_hash_range_hashing_fns"><img src=

- "hash_ranged_hash_range_hashing_fns.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Hash functions, ranged-hash functions, and

- range-hashing functions.</h6>

- Let U be a domain (e.g., the integers, or the

- strings of 3 characters). A hash-table algorithm needs to map

- elements of U "uniformly" into the range [0,..., m -

- 1] (where m is a non-negative integral value, and

- is, in general, time varying). I.e., the algorithm needs

- a ranged-hash function

- f : U × Z<sub>+</sub> → Z<sub>+</sub>,

- such that for any u in U,

- 0 ≤ f(u, m) ≤ m - 1,

- and which has "good uniformity" properties [<a href=

- "references.html#knuth98sorting">knuth98sorting</a>]. One

- common solution is to use the composition of the hash

- function

- h : U → Z<sub>+</sub>,

- which maps elements of U into the non-negative

- integrals, and

- g : Z<sub>+</sub> × Z<sub>+</sub> → Z<sub>+</sub>,

- which maps a non-negative hash value, and a non-negative

- range upper-bound into a non-negative integral in the range

- between 0 (inclusive) and the range upper bound (exclusive),

- i.e., for any r in Z<sub>+</sub>,

- 0 ≤ g(r, m) ≤ m - 1.

- The resulting ranged-hash function is

- <a name="ranged_hash_composed_of_hash_and_range_hashing"

- id="ranged_hash_composed_of_hash_and_range_hashing">f(u , m) =

- g(h(u), m)</a> (1) .

- From the above, it is obvious that given g and

- h, f can always be composed (however the converse

- is not true). The STL's hash-based containers allow specifying

- a hash function, and use a hard-wired range-hashing function;

- the ranged-hash function is implicitly composed.
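The composition in (1) can be sketched in a few lines of standard C++. This is only an illustration of the idea, not the pb_ds interface: the names `range_hash_g`, `ranged_hash_f`, and `identity_hash` are hypothetical, and the division method is assumed for g.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Range-hashing function g(r, m): maps a hash value r into [0, m - 1].
// The division method is assumed here; m must be positive.
inline std::size_t range_hash_g(std::size_t r, std::size_t m) {
    assert(m > 0);
    return r % m;
}

// Hypothetical hash functor h that returns its argument unchanged,
// which makes the composition easy to check by hand.
struct identity_hash {
    std::size_t operator()(unsigned x) const { return x; }
};

// Ranged-hash function f(u, m) = g(h(u), m), composed as in (1).
template <typename Key, typename Hash = std::hash<Key> >
std::size_t ranged_hash_f(const Key& u, std::size_t m, Hash h = Hash()) {
    return range_hash_g(h(u), m);
}
```

With `identity_hash`, `ranged_hash_f(42u, 10, identity_hash())` evaluates g(42, 10). With the default `std::hash`, the result is implementation-dependent but always lies below m.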

- The above describes the case where a key is to be mapped

- into a single position within a hash table, e.g.,

- in a collision-chaining table. In other cases, a key is to be

- mapped into a sequence of positions within a table,

- e.g., in a probing table. Similar terms apply in this

- case: the table requires a ranged probe function,

- mapping a key into a sequence of positions within the table.

- This is typically achieved by composing a hash function

- mapping the key into a non-negative integral type, a

- probe function transforming the hash value into a

- sequence of hash values, and a range-hashing function

- transforming the sequence of hash values into a sequence of

- positions.

- <h3><a name="range_hashing_fns" id=

- "range_hashing_fns">Range-Hashing Functions</a></h3>

- Some common choices for range-hashing functions are the

- division, multiplication, and middle-square methods [<a href=

- "references.html#knuth98sorting">knuth98sorting</a>], defined

- as

- <a name="division_method" id="division_method">g(r, m) =

- r mod m</a> (2) ,

- g(r, m) = &lceil; u/v ( a r mod v ) &rceil;,

- and

- g(r, m) = &lceil; u/v ( r<sup>2</sup> mod v ) &rceil;,

- respectively, for some positive integers u and

- v (typically powers of 2), and some a. Each of

- these range-hashing functions works best in a different

- setting.

- The division method <a href="#division_method">(2)</a> is a

- very common choice. However, even this single method can be

- implemented in two very different ways. It is possible to

- implement <a href="#division_method">(2)</a> using the low

- level % (modulo) operation (for any m), or the

- low level & (bit-mask) operation (for the case where

- m is a power of 2), i.e.,

- <a name="division_method_prime_mod" id=

- "division_method_prime_mod">g(r, m) = r % m</a> (3) ,

- and

- <a name="division_method_bit_mask" id=

- "division_method_bit_mask">g(r, m) = r & (m - 1), (m =

- 2<sup>k</sup></a> for some k) (4),

- respectively.

- The % (modulo) implementation <a href=

- "#division_method_prime_mod">(3)</a> has the advantage that for

- m a prime far from a power of 2, g(r, m) is

- affected by all the bits of r (minimizing the chance of

- collision). It has the disadvantage of using the costly modulo

- operation. This method is hard-wired into SGI's implementation

- [<a href="references.html#sgi_stl">sgi_stl</a>].

- The & (bit-mask) implementation <a href=

- "#division_method_bit_mask">(4)</a> has the advantage of

- relying on the fast bitwise AND operation. It has the

- disadvantage that g(r, m) is affected only by the

- low-order bits of r. This method is hard-wired into

- Dinkumware's implementation [<a href=

- "references.html#dinkumware_stl">dinkumware_stl</a>].
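The difference between (3) and (4) can be seen in a small sketch. The function names below are hypothetical and are not the `direct_mod_range_hashing` or `direct_mask_range_hashing` classes themselves; they only mirror the two formulas.

```cpp
#include <cassert>
#include <cstddef>

// Equation (3): modulo-based range hashing; valid for any positive m.
inline std::size_t mod_range_hash(std::size_t r, std::size_t m) {
    assert(m > 0);
    return r % m;
}

// Equation (4): mask-based range hashing; valid only when m is a power of 2.
inline std::size_t mask_range_hash(std::size_t r, std::size_t m) {
    assert(m > 0 && (m & (m - 1)) == 0);
    return r & (m - 1);
}
```

For the hash values 0x10, 0x20, and 0x30, which differ only above the low four bits, the mask version with m = 16 maps all three to position 0, while the modulo version with the prime m = 13 spreads them over positions 3, 6, and 9.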

- <h3><a name="hash_policies_ranged_hash_policies" id=

- "hash_policies_ranged_hash_policies">Ranged-Hash

- Functions</a></h3>

- In some cases it is beneficial to allow the

- client to specify a ranged-hash function directly. It is

- true that the writer of the ranged-hash function cannot rely

- on the values of m having specific numerical properties

- suitable for hashing (in the sense used in [<a href=

- "references.html#knuth98sorting">knuth98sorting</a>]), since

- the values of m are determined by a resize policy with

- possibly orthogonal considerations.

- There are two cases where a ranged-hash function can be

- superior. The first is when using perfect hashing [<a href=

- "references.html#knuth98sorting">knuth98sorting</a>]; the

- second is when the values of m can be used to estimate

- the "general" number of distinct values required. This is

- described in the following.

- Let

- s = [ s<sub>0</sub>,..., s<sub>t - 1</sub>]

- be a string of t characters, each of which is from

- domain S. Consider the following ranged-hash

- function:

- <a name="total_string_dna_hash" id=

- "total_string_dna_hash">f<sub>1</sub>(s, m) = ∑ <sub>i =

- 0</sub><sup>t - 1</sup> s<sub>i</sub> a<sup>i</sup> mod

- m</a> (5),

- where a is some non-negative integral value. This is

- the standard string-hashing function used in SGI's

- implementation (with a = 5) [<a href=

- "references.html#sgi_stl">sgi_stl</a>]. Its advantage is that

- it takes into account all of the characters of the string.

- Now assume that s is the string representation

- of a long DNA sequence (and so S = {'A', 'C', 'G',

- 'T'}). In this case, scanning the entire string might be

- prohibitively expensive. A possible alternative might be to use

- only the first k characters of the string, where

- |S|<sup>k</sup> ≥ m,

- i.e., using the hash function

- <a name="only_k_string_dna_hash" id=

- "only_k_string_dna_hash">f<sub>2</sub>(s, m) = ∑ <sub>i

- = 0</sub><sup>k - 1</sup> s<sub>i</sub> a<sup>i</sup> mod

- m</a> (6),

- requiring scanning over only

- k = log<sub>4</sub>( m )

- characters.
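A prefix-only ranged-hash function in the spirit of (6) might look as follows. This is a sketch under stated assumptions: the names `dna_prefix_length` and `dna_prefix_hash` are hypothetical, the default a = 5 is borrowed from the SGI remark above, k is rounded up to the smallest integer with 4<sup>k</sup> ≥ m, and m is assumed small enough that m * m fits in `std::size_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Smallest k such that 4^k >= m, i.e. the number of base-4 digits of m - 1.
inline std::size_t dna_prefix_length(std::size_t m) {
    assert(m > 0);
    std::size_t k = 0;
    for (std::size_t v = m - 1; v > 0; v /= 4)
        ++k;
    return k;
}

// Ranged-hash function over the first k characters only:
// the sum of s[i] * a^i for i in [0, k), reduced modulo m.
inline std::size_t dna_prefix_hash(const std::string& s, std::size_t m,
                                   std::size_t a = 5) {
    const std::size_t k = std::min(dna_prefix_length(m), s.size());
    std::size_t sum = 0;
    std::size_t power = 1 % m;  // a^i mod m
    for (std::size_t i = 0; i < k; ++i) {
        const std::size_t c = static_cast<unsigned char>(s[i]) % m;
        sum = (sum + c * power) % m;
        power = (power * (a % m)) % m;
    }
    return sum;
}
```

Because only the first k characters are read, two sequences that share a prefix of length k hash to the same position regardless of their total length, which is the price paid for not scanning the whole string.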

- Other more elaborate hash-functions might scan k

- characters starting at a random position (determined at each

- resize), or scan k random positions (determined at

- each resize), i.e., using

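- f<sub>3</sub>(s, m) = ∑ <sub>i =

- r<sub>0</sub></sub><sup>r<sub>0</sub> + k - 1</sup> s<sub>i</sub>

- a<sup>i</sup> mod m,

- or

- f<sub>4</sub>(s, m) = ∑ <sub>i = 0</sub><sup>k -

- 1</sup> s<sub>r<sub>i</sub></sub> a<sup>r<sub>i</sub></sup> mod

- m,

- respectively, for r<sub>0</sub>,..., r<sub>k-1</sub>

- each in the (inclusive) range [0,...,t-1].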

- It should be noted that the above functions cannot be

- decomposed as <a href=

- "#ranged_hash_composed_of_hash_and_range_hashing">(1)</a> .

- <h3><a name="pb_ds_imp" id="pb_ds_imp">Implementation</a></h3>

- This sub-subsection describes the implementation of the

- above in <tt>pb_ds</tt>. It first explains range-hashing

- functions in collision-chaining tables, then ranged-hash

- functions in collision-chaining tables, then probing-based

- tables, and, finally, lists the relevant classes in

- <tt>pb_ds</tt>.

- <h4>Range-Hashing and Ranged-Hashes in Collision-Chaining

- Tables</h4>

- <a href=

- "cc_hash_table.html"><tt>cc_hash_table</tt></a> is

- parametrized by <tt>Hash_Fn</tt> and <tt>Comb_Hash_Fn</tt>, a

- hash functor and a combining hash functor, respectively.

- In general, <tt>Comb_Hash_Fn</tt> is considered a

- range-hashing functor. <a href=

- "cc_hash_table.html"><tt>cc_hash_table</tt></a>

- synthesizes a ranged-hash function from <tt>Hash_Fn</tt> and

- <tt>Comb_Hash_Fn</tt> (see <a href=

- "#ranged_hash_composed_of_hash_and_range_hashing">(1)</a>

- above). Figure <a href="#hash_range_hashing_seq_diagram">Insert

- hash sequence diagram</a> shows an <tt>insert</tt> sequence

- diagram for this case. The user inserts an element (point A),

- the container transforms the key into a non-negative integral

- using the hash functor (points B and C), and transforms the

- result into a position using the combining functor (points D

- and E).

- <h6 class="c1"><a name="hash_range_hashing_seq_diagram" id=

- "hash_range_hashing_seq_diagram"><img src=

- "hash_range_hashing_seq_diagram.png" alt="no image" /></a></h6>

- <h6 class="c1">Insert hash sequence diagram.</h6>

- If <a href=

- "cc_hash_table.html"><tt>cc_hash_table</tt></a>'s

- hash-functor, <tt>Hash_Fn</tt> is instantiated by <a href=

- "null_hash_fn.html"><tt>null_hash_fn</tt></a> (see <a href=

- "concepts.html#concepts_null_policies">Interface::Concepts::Null

- Policy Classes</a>), then <tt>Comb_Hash_Fn</tt> is taken to be

- a ranged-hash function. Figure <a href=

- "#hash_range_hashing_seq_diagram2">Insert hash sequence diagram

- with a null hash policy</a> shows an <tt>insert</tt> sequence

- diagram. The user inserts an element (point A), the container

- transforms the key into a position using the combining functor

- (points B and C).

- <h6 class="c1"><a name="hash_range_hashing_seq_diagram2" id=

- "hash_range_hashing_seq_diagram2"><img src=

- "hash_range_hashing_seq_diagram2.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Insert hash sequence diagram with a null hash

- policy.</h6>

- <h4>Probing Tables</h4>

- <a href=

- "gp_hash_table.html"><tt>gp_hash_table</tt></a> is

- parametrized by <tt>Hash_Fn</tt>, <tt>Probe_Fn</tt>, and

- <tt>Comb_Probe_Fn</tt>. As before, if <tt>Hash_Fn</tt> and

- <tt>Probe_Fn</tt> are, respectively, <a href=

- "null_hash_fn.html"><tt>null_hash_fn</tt></a> and <a href=

- "null_probe_fn.html"><tt>null_probe_fn</tt></a>, then

- <tt>Comb_Probe_Fn</tt> is a ranged-probe functor. Otherwise,

- <tt>Hash_Fn</tt> is a hash functor, <tt>Probe_Fn</tt> is a

- functor for offsets from a hash value, and

- <tt>Comb_Probe_Fn</tt> transforms a probe sequence into a

- sequence of positions within the table.

- <h4>Pre-Defined Policies</h4>

- <tt>pb_ds</tt> contains some pre-defined classes

- implementing range-hashing and probing functions:

- <ol>

- <li><a href=

- "direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>

- and <a href=

- "direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>

- are range-hashing functions based on a bit-mask and a modulo

- operation, respectively.</li>

- <li><a href=

- "linear_probe_fn.html"><tt>linear_probe_fn</tt></a>, and

- <a href=

- "quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a> are

- a linear probe and a quadratic probe function,

- respectively.</li>

- </ol>Figure <a href="#hash_policy_cd">Hash policy class

- diagram</a> shows a class diagram.

- <h6 class="c1"><a name="hash_policy_cd" id=

- "hash_policy_cd"><img src="hash_policy_cd.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Hash policy class diagram.</h6>

- <h2><a name="resize_policies" id="resize_policies">Resize

- Policies</a></h2>

- <h3><a name="general" id="general">General Terms</a></h3>

- Hash-tables, as opposed to trees, do not naturally grow or

- shrink. It is necessary to specify policies to determine how

- and when a hash table should change its size. Usually, resize

- policies can be decomposed into orthogonal policies:

- <ol>

- <li>A size policy indicating how a hash table

- should grow (e.g., it should multiply by powers of

- 2).</li>

- <li>A trigger policy indicating when a hash

- table should grow (e.g., a load factor is

- exceeded).</li>

- </ol>

- <h3><a name="size_policies" id="size_policies">Size

- Policies</a></h3>

- Size policies determine how a hash table changes size. These

- policies are simple, and there are relatively few sensible

- options. An exponential-size policy (with the initial size and

- growth factors both powers of 2) works well with a mask-based

- range-hashing function (see <a href=

- "#hash_policies">Range-Hashing Policies</a>), and is the

- hard-wired policy used by Dinkumware [<a href=

- "references.html#dinkumware_stl">dinkumware_stl</a>]. A

- prime-list based policy works well with a modulo-prime range

- hashing function (see <a href="#hash_policies">Range-Hashing

- Policies</a>), and is the hard-wired policy used by SGI's

- implementation [<a href=

- "references.html#sgi_stl">sgi_stl</a>].

- <h3><a name="trigger_policies" id="trigger_policies">Trigger

- Policies</a></h3>

- Trigger policies determine when a hash table changes size.

- Following is a description of two policies: load-check

- policies, and collision-check policies.

- Load-check policies are straightforward. The user specifies

- two factors, α<sub>min</sub> and

- α<sub>max</sub>, and the hash table maintains the

- invariant that

- <a name="load_factor_min_max" id=

- "load_factor_min_max">α<sub>min</sub> ≤ (number of

- stored elements) / (hash-table size) ≤

- α<sub>max</sub></a> (1).
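The load-check invariant can be expressed as two simple predicates. The struct below is a hypothetical illustration, not the interface of `hash_load_check_resize_trigger`; the member names are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical load-check trigger holding the two user-supplied factors.
struct load_check_sketch {
    double alpha_min;
    double alpha_max;

    // Current load: (number of stored elements) / (hash-table size).
    static double load(std::size_t num_elements, std::size_t table_size) {
        assert(table_size > 0);
        return static_cast<double>(num_elements) /
               static_cast<double>(table_size);
    }

    // True when the load exceeds alpha_max, so the table should grow.
    bool should_grow(std::size_t num_elements, std::size_t table_size) const {
        return load(num_elements, table_size) > alpha_max;
    }

    // True when the load falls below alpha_min, so the table should shrink.
    bool should_shrink(std::size_t num_elements, std::size_t table_size) const {
        return load(num_elements, table_size) < alpha_min;
    }
};
```

With factors 0.125 and 0.5 and a table of 8 entries, storing a fifth element pushes the load to 0.625 and would trigger growth, while loads of exactly 0.125 or 0.5 still satisfy the invariant.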

- Collision-check policies work in the opposite direction of

- load-check policies. They focus on keeping the number of

- collisions moderate and hoping that the size of the table will

- not grow very large, instead of keeping a moderate load-factor

- and hoping that the number of collisions will be small. A

- maximal collision-check policy resizes when the longest

- probe-sequence grows too large.

- Consider Figure <a href="#balls_and_bins">Balls and

- bins</a>. Let the size of the hash table be denoted by

- m, the length of a probe sequence be denoted by

- k, and some load factor be denoted by α. We would

- like to calculate the minimal length of k, such that if

- there were α m elements in the hash table, a probe

- sequence of length k would be found with probability at

- most 1/m.

- <h6 class="c1"><a name="balls_and_bins" id=

- "balls_and_bins"><img src="balls_and_bins.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Balls and bins.</h6>

- Denote the probability that a probe sequence of length

- k appears in bin i by p<sub>i</sub>, the

- length of the probe sequence of bin i by

- l<sub>i</sub>, and assume uniform distribution. Then

- <a name="prob_of_p1" id=

- "prob_of_p1">p<sub>1</sub></a> =

- P(l<sub>1</sub> ≥ k) =

- P(l<sub>1</sub> ≥ α ( 1 + k / α - 1 ) ) ≤ (a)

- e ^ ( - ( α ( k / α - 1 )<sup>2</sup> ) / 2 ) (2),

- where (a) follows from the Chernoff bound [<a href=

- "references.html#motwani95random">motwani95random</a>]. To

- calculate the probability that some bin contains a probe

- sequence greater than k, we note that the

- l<sub>i</sub> are negatively-dependent [<a href=

- "references.html#dubhashi98neg">dubhashi98neg</a>]. Let

- I(.) denote the indicator function. Then

- <a name="at_least_k_i_n_some_bin" id=

- "at_least_k_i_n_some_bin">P( exists<sub>i</sub>

- l<sub>i</sub> ≥ k ) =</a>

- P ( ∑ <sub>i = 1</sub><sup>m</sup>

- I(l<sub>i</sub> ≥ k) ≥ 1 ) =

- P ( ∑ <sub>i = 1</sub><sup>m</sup> I (

- l<sub>i</sub> ≥ k ) ≥ m p<sub>1</sub> ( 1 + 1 / (m

- p<sub>1</sub>) - 1 ) ) ≤ (a)

- e ^ ( ( - m p<sub>1</sub> ( 1 / (m p<sub>1</sub>)

- - 1 )<sup>2</sup> ) / 2 ) (3),

- where (a) follows from the fact that the Chernoff bound can

- be applied to negatively-dependent variables [<a href=

- "references.html#dubhashi98neg">dubhashi98neg</a>]. Inserting

- <a href="#prob_of_p1">(2)</a> into <a href=

- "#at_least_k_i_n_some_bin">(3)</a>, and equating with

- 1/m, we obtain

- k ~ √ ( 2 α ln ( 2 m ln(m) ) ).
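To get a feel for the magnitude of this threshold, the closing estimate can be evaluated numerically. The sketch below assumes the estimate reads k ~ √(2 α ln(2 m ln m)), which is one reading of the garbled formula above; the function name is hypothetical and is not part of `cc_hash_max_collision_check_resize_trigger`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Evaluates sqrt(2 * alpha * ln(2 * m * ln(m))) for a load factor alpha
// and a table size m. Requires m >= 2 so that the inner logarithm is positive.
inline double max_probe_length_estimate(double alpha, std::size_t m) {
    assert(alpha > 0.0 && m >= 2);
    const double md = static_cast<double>(m);
    return std::sqrt(2.0 * alpha * std::log(2.0 * md * std::log(md)));
}
```

For α = 0.5 and m = 1024 the estimate is a little above 3, and it grows very slowly with m, so under these assumptions a collision-check trigger tolerates only short probe sequences even for large tables.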

- <h3><a name="imp_pb_ds" id="imp_pb_ds">Implementation</a></h3>

- This sub-subsection describes the implementation of the

- above in <tt>pb_ds</tt>. It first describes resize policies and

- their decomposition into trigger and size policies, then

- describes pre-defined classes, and finally discusses controlled

- access to the policies' internals.

- <h4>Resize Policies and Their Decomposition</h4>

- Each hash-based container is parametrized by a

- <tt>Resize_Policy</tt> parameter; the container derives

- <tt>public</tt>ly from <tt>Resize_Policy</tt>. For

- example:

- <pre>

-<a href="cc_hash_table.html">cc_hash_table</a><

- typename Key,

- typename Mapped,

- ...

- typename Resize_Policy

- ...> :

- public Resize_Policy

-</pre>

- As a container object is modified, it continuously notifies

- its <tt>Resize_Policy</tt> base of internal changes

- (e.g., collisions encountered and elements being

- inserted). It queries its <tt>Resize_Policy</tt> base whether

- it needs to be resized, and if so, to what size.

- Figure <a href="#insert_resize_sequence_diagram1">Insert

- resize sequence diagram</a> shows a (possible) sequence diagram

- of an insert operation. The user inserts an element; the hash

- table notifies its resize policy that a search has started

- (point A); in this case, a single collision is encountered -

- the table notifies its resize policy of this (point B); the

- container finally notifies its resize policy that the search

- has ended (point C); it then queries its resize policy whether

- a resize is needed, and if so, what is the new size (points D

- to G); following the resize, it notifies the policy that a

- resize has completed (point H); finally, the element is

- inserted, and the policy notified (point I).

- <h6 class="c1"><a name="insert_resize_sequence_diagram1" id=

- "insert_resize_sequence_diagram1"><img src=

- "insert_resize_sequence_diagram1.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Insert resize sequence diagram.</h6>

- In practice, a resize policy can usually be decomposed orthogonally

- into a size policy and a trigger policy. Consequently,

- the library contains a single class for instantiating a resize

- policy: <a href=

- "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>

- is parametrized by <tt>Size_Policy</tt> and

- <tt>Trigger_Policy</tt>, derives <tt>public</tt>ly from

- both, and acts as a standard delegate [<a href=

- "references.html#gamma95designpatterns">gamma95designpatterns</a>]

- to these policies.

- Figures <a href="#insert_resize_sequence_diagram2">Standard

- resize policy trigger sequence diagram</a> and <a href=

- "#insert_resize_sequence_diagram3">Standard resize policy size

- sequence diagram</a> show sequence diagrams illustrating the

- interaction between the standard resize policy and its trigger

- and size policies, respectively.

- <h6 class="c1"><a name="insert_resize_sequence_diagram2" id=

- "insert_resize_sequence_diagram2"><img src=

- "insert_resize_sequence_diagram2.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Standard resize policy trigger sequence

- diagram.</h6>

- <h6 class="c1"><a name="insert_resize_sequence_diagram3" id=

- "insert_resize_sequence_diagram3"><img src=

- "insert_resize_sequence_diagram3.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Standard resize policy size sequence

- diagram.</h6>

- <h4>Pre-Defined Policies</h4>

- The library includes the following

- instantiations of size and trigger policies:

- <ol>

- <li><a href=

- "hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>

- implements a load check trigger policy.</li>

- <li><a href=

- "cc_hash_max_collision_check_resize_trigger.html"><tt>cc_hash_max_collision_check_resize_trigger</tt></a>

- implements a collision check trigger policy.</li>

- <li><a href=

- "hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>

- implements an exponential-size policy (which should be used

- with mask range hashing).</li>

- <li><a href=

- "hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>

- implements a size policy based on a sequence of primes

- [<a href="references.html#sgi_stl">sgi_stl</a>] (which should

- be used with mod range hashing).</li>

- </ol>

- Figure <a href="#resize_policy_cd">Resize policy class

- diagram</a> gives an overall picture of the resize-related

- classes. <a href=

- "basic_hash_table.html"><tt>basic_hash_table</tt></a>

- is parametrized by <tt>Resize_Policy</tt>, which it subclasses

- publicly. This class is currently instantiated only by <a href=

- "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>.

- <a href=

- "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>

- itself is parametrized by <tt>Trigger_Policy</tt> and

- <tt>Size_Policy</tt>. Currently, <tt>Trigger_Policy</tt> is

- instantiated by <a href=

- "hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>,

- or <a href=

- "cc_hash_max_collision_check_resize_trigger.html"><tt>cc_hash_max_collision_check_resize_trigger</tt></a>;

- <tt>Size_Policy</tt> is instantiated by <a href=

- "hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>,

- or <a href=

- "hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>.

- <h6 class="c1"><a name="resize_policy_cd" id=

- "resize_policy_cd"><img src="resize_policy_cd.png" alt=

- "no image" /></a></h6>

- <h6 class="c1">Resize policy class diagram.</h6>

- <h4>Controlled Access to Policies' Internals</h4>

- There are cases where (controlled) access to resize

- policies' internals is beneficial. E.g., it is sometimes

- useful to query a hash-table for the table's actual size (as

- opposed to its <tt>size()</tt> - the number of values it

- currently holds); it is sometimes useful to set a table's

- initial size, externally resize it, or change load factors.

- Clearly, supporting such methods both decreases the

- encapsulation of hash-based containers, and increases the

- diversity between different associative-containers' interfaces.

- Conversely, omitting such methods can decrease containers'

- flexibility.

- In order to avoid, to the extent possible, the above

- conflict, the hash-based containers themselves do not address

- any of these questions; this is deferred to the resize policies,

- which are easier to change or replace. Thus, for example,

- neither <a href=

- "cc_hash_table.html"><tt>cc_hash_table</tt></a> nor

- <a href=

- "gp_hash_table.html"><tt>gp_hash_table</tt></a>

- contains methods for querying the actual size of the table; this

- is deferred to <a href=

- "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>.

- Furthermore, the policies themselves are parametrized by

- template arguments that determine the methods they support

- ([<a href=

- "references.html#alexandrescu01modern">alexandrescu01modern</a>]

- shows techniques for doing so). <a href=

- "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>

- is parametrized by <tt>External_Size_Access</tt> that

- determines whether it supports methods for querying the actual

- size of the table or resizing it. <a href=

- "hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>

- is parametrized by <tt>External_Load_Access</tt> that

- determines whether it supports methods for querying or

- modifying the loads. <a href=

- "cc_hash_max_collision_check_resize_trigger.html"><tt>cc_hash_max_collision_check_resize_trigger</tt></a>

- is parametrized by <tt>External_Load_Access</tt> that

- determines whether it supports methods for querying the

- load.

- Some operations, for example, resizing a container at

- run time, or changing the load factors of a load-check trigger

- policy, require the container itself to resize. As mentioned

- above, the hash-based containers themselves do not contain

- these types of methods, only their resize policies.

- Consequently, there must be some mechanism for a resize policy

- to manipulate the hash-based container. As the hash-based

- container is a subclass of the resize policy, this is done

- through virtual methods. Each hash-based container has a

- <tt>private</tt> <tt>virtual</tt> method:

- <pre>

-virtual void

- do_resize

- (size_type new_size);

-</pre>

- which resizes the container. Implementations of

- <tt>Resize_Policy</tt> can export public methods for resizing

- the container externally; these methods internally call

- <tt>do_resize</tt> to resize the table.
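The mechanism described above can be sketched with two small classes. Both class names and the `resize` and `actual_size` members are hypothetical and are not the pb_ds classes; a real container would also rehash its entries inside `do_resize`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical resize policy that exports a public method for external resizing.
class sketch_resize_policy {
public:
    virtual ~sketch_resize_policy() {}

    // Public entry point: forwards to the container through the virtual hook.
    void resize(std::size_t new_size) { do_resize(new_size); }

private:
    // Implemented by the container that derives from this policy.
    virtual void do_resize(std::size_t new_size) = 0;
};

// Hypothetical container deriving publicly from its resize policy.
template <typename Resize_Policy>
class sketch_table : public Resize_Policy {
public:
    explicit sketch_table(std::size_t initial_size) : slots_(initial_size) {}

    // Actual table size, as opposed to the number of stored values.
    std::size_t actual_size() const { return slots_.size(); }

private:
    // Private virtual method that resizes the container's storage.
    void do_resize(std::size_t new_size) override { slots_.assign(new_size, 0); }

    std::vector<int> slots_;
};
```

A client holding a `sketch_table<sketch_resize_policy>` can call `resize(16)` even though the container declares no such method itself: the call is inherited from the policy and reaches the container's storage through `do_resize`.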

- <h2><a name="policy_interaction" id="policy_interaction">Policy

- Interaction</a></h2>

- Hash-tables are unfortunately especially susceptible to

- choice of policies. One of the more complicated aspects of this

- is that poor combinations of good policies can form a poor

- container. Following are some considerations.

- <h3><a name="policy_interaction_probe_size_trigger" id=

- "policy_interaction_probe_size_trigger">Probe Policies, Size

- Policies, and Trigger Policies</a></h3>

- Some combinations do not work well for probing containers.

- For example, combining a quadratic probe policy with an

- exponential size policy can yield a poor container: when an

- element is inserted, a trigger policy might decide that there

- is no need to resize, as the table still contains unused

- entries; the probe sequence, however, might never reach any of

- the unused entries.
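The problem can be made concrete by counting how many table positions a quadratic probe sequence can reach. The sketch assumes the probe offsets are i<sup>2</sup> for i = 0, 1, 2, ... combined with modulo range hashing; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Counts the distinct positions (h + i*i) mod m visited for i = 0 .. m - 1.
// The sequence of i*i mod m repeats with period m, so no later probe
// reaches a position that is not already counted here.
inline std::size_t reachable_positions(std::size_t h, std::size_t m) {
    assert(m > 0);
    std::set<std::size_t> seen;
    for (std::size_t i = 0; i < m; ++i)
        seen.insert((h + i * i) % m);
    return seen.size();
}
```

For a table of size 8 only 3 positions are reachable from any starting hash value, and for size 16 only 4. A trigger policy that sees unused entries elsewhere in the table would therefore not request a resize, although the insert cannot succeed.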

- Unfortunately, <tt>pb_ds</tt> cannot detect such problems at

- compilation (the halting problem reduces to detecting them). It therefore defines

- an exception class <a href=

- "insert_error.html"><tt>insert_error</tt></a> to throw an

- exception in this case.

- <h3><a name="policy_interaction_hash_trigger" id=

- "policy_interaction_hash_trigger">Hash Policies and Trigger

- Policies</a></h3>

- Some trigger policies are especially susceptible to poor

- hash functions. Suppose, as an extreme case, that the hash

- function transforms each key to the same hash value. After some

- inserts, a collision detecting policy will always indicate that

- the container needs to grow.

- The library, therefore, by design, limits each operation to

- one resize. For each <tt>insert</tt>, for example, it queries

- only once whether a resize is needed.

- <h3><a name="policy_interaction_eq_sth_hash" id=

- "policy_interaction_eq_sth_hash">Equivalence Functors, Storing

- Hash Values, and Hash Functions</a></h3>

- <a href=

- "cc_hash_table.html"><tt>cc_hash_table</tt></a> and

- <a href=

- "gp_hash_table.html"><tt>gp_hash_table</tt></a> are

- parametrized by an equivalence functor and by a

- <tt>Store_Hash</tt> parameter. If the latter parameter is

- <tt>true</tt>, then the container stores with each entry

- a hash value, and uses this value in case of collisions to

- determine whether to apply the equivalence functor. This can lower the

- cost of collisions for some types, but increase the cost of

- collisions for other types.
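One way to read the <tt>Store_Hash</tt> trade-off is sketched below: the stored hash value is compared first, and the key comparison runs only when the hash values agree. The entry layout and function name are hypothetical, and the sketch assumes this is how the stored value is used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical entry layout when the hash value is stored with each entry.
struct stored_entry {
    std::string key;
    std::size_t hash;  // cached hash value of key
};

// Returns true if the entry holds the given key. eq_calls counts how often
// the (possibly expensive) key comparison was actually performed.
inline bool entry_matches(const stored_entry& e, const std::string& key,
                          std::size_t key_hash, std::size_t& eq_calls) {
    if (e.hash != key_hash)
        return false;  // cheap rejection: the keys cannot be equal
    ++eq_calls;
    return e.key == key;
}
```

For keys that are expensive to compare, such as long strings, the extra integer comparison saves work on most collisions. For keys that are cheap to compare, such as integers, it adds a comparison and memory per entry without saving anything.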

- If a ranged-hash function or ranged probe function is

- directly supplied, however, then it makes no sense to store the

- hash value with each entry. <tt>pb_ds</tt>'s container will

- fail at compilation, by design, if this is attempted.

- <h3><a name="policy_interaction_size_load_check" id=

- "policy_interaction_size_load_check">Size Policies and

- Load-Check Trigger Policies</a></h3>

- Assume a size policy issues an increasing sequence of sizes

- a, a q, a q<sup>2</sup>, a q<sup>3</sup>, ... For

- example, an exponential size policy might issue the sequence of

- sizes 8, 16, 32, 64, ...

- If a load-check trigger policy is used, with loads

- α<sub>min</sub> and α<sub>max</sub>,

- respectively, then it is a good idea to have:

- <ol>

- <li>α<sub>max</sub> ~ 1 / q</li>

- <li>α<sub>min</sub> < 1 / (2 q)</li>

- </ol>

- This will ensure that the amortized hash cost of each

- modifying operation is at most approximately 3.

- α<sub>min</sub> ~ α<sub>max</sub> is, in

- any case, a bad choice, and α<sub>min</sub> >

- α<sub>max</sub> is horrendous.

- </div>

-</body>

-</html>
