third_party/re2/util/hash.cc - Issue 1544433002: Replace RE2 import with a dependency

Unified Diff: third_party/re2/util/hash.cc

Issue 1544433002: Replace RE2 import with a dependency (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Re-Added LICENSE and OWNERS file Created 5 years ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View side-by-side diff with in-line comments

Download patch

Index: third_party/re2/util/hash.cc

diff --git a/third_party/re2/util/hash.cc b/third_party/re2/util/hash.cc

deleted file mode 100644

index dfef7b7c364d0d81f3f8de633f141d1757b63d9e..0000000000000000000000000000000000000000

--- a/third_party/re2/util/hash.cc

+++ /dev/null

@@ -1,231 +0,0 @@

-// Modified by Russ Cox to add "namespace re2".

-// Also threw away all but hashword and hashword2.

-// http://burtleburtle.net/bob/c/lookup3.c

-/*

--------------------------------------------------------------------------------

-lookup3.c, by Bob Jenkins, May 2006, Public Domain.

-These are functions for producing 32-bit hashes for hash table lookup.

-hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()

-are externally useful functions. Routines to test the hash are included

-if SELF_TEST is defined. You can use this free for any purpose. It's in

-the public domain. It has no warranty.

-You probably want to use hashlittle(). hashlittle() and hashbig()

-hash byte arrays. hashlittle() is is faster than hashbig() on

-little-endian machines. Intel and AMD are little-endian machines.

-On second thought, you probably want hashlittle2(), which is identical to

-hashlittle() except it returns two 32-bit hashes for the price of one.

-You could implement hashbig2() if you wanted but I haven't bothered here.

-If you want to find a hash of, say, exactly 7 integers, do

- a = i1; b = i2; c = i3;

- mix(a,b,c);

- a += i4; b += i5; c += i6;

- mix(a,b,c);

- a += i7;

- final(a,b,c);

-then use c as the hash value. If you have a variable length array of

-4-byte integers to hash, use hashword(). If you have a byte array (like

-a character string), use hashlittle(). If you have several byte arrays, or

-a mix of things, see the comments above hashlittle().

-Why is this so big? I read 12 bytes at a time into 3 4-byte integers,

-then mix those integers. This is fast (you can do a lot more thorough

-mixing with 12*3 instructions on 3 integers than you can with 3 instructions

-on 1 byte), but shoehorning those bytes into integers efficiently is messy.

--------------------------------------------------------------------------------

-*/

-#include "util/util.h"

-#define rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))

-/*

--------------------------------------------------------------------------------

-mix -- mix 3 32-bit values reversibly.

-This is reversible, so any information in (a,b,c) before mix() is

-still in (a,b,c) after mix().

-If four pairs of (a,b,c) inputs are run through mix(), or through

-mix() in reverse, there are at least 32 bits of the output that

-are sometimes the same for one pair and different for another pair.

-This was tested for:

-* pairs that differed by one bit, by two bits, in any combination

- of top bits of (a,b,c), or in any combination of bottom bits of

- (a,b,c).

-* "differ" is defined as +, -, ^, or ~^. For + and -, I transformed

- the output delta to a Gray code (a^(a>>1)) so a string of 1's (as

- is commonly produced by subtraction) look like a single 1-bit

- difference.

-* the base values were pseudorandom, all zero but one bit set, or

- all zero plus a counter that starts at zero.

-Some k values for my "a-=c; a^=rot(c,k); c+=b;" arrangement that

-satisfy this are

- 4 6 8 16 19 4

- 9 15 3 18 27 15

- 14 9 3 7 17 3

-Well, "9 15 3 18 27 15" didn't quite get 32 bits diffing

-for "differ" defined as + with a one-bit base and a two-bit delta. I

-used http://burtleburtle.net/bob/hash/avalanche.html to choose

-the operations, constants, and arrangements of the variables.

-This does not achieve avalanche. There are input bits of (a,b,c)

-that fail to affect some output bits of (a,b,c), especially of a. The

-most thoroughly mixed value is c, but it doesn't really even achieve

-avalanche in c.

-This allows some parallelism. Read-after-writes are good at doubling

-the number of bits affected, so the goal of mixing pulls in the opposite

-direction as the goal of parallelism. I did what I could. Rotates

-seem to cost as much as shifts on every machine I could lay my hands

-on, and rotates are much kinder to the top and bottom bits, so I used

-rotates.

--------------------------------------------------------------------------------

-*/

-#define mix(a,b,c) \

-{ \

- a -= c; a ^= rot(c, 4); c += b; \

- b -= a; b ^= rot(a, 6); a += c; \

- c -= b; c ^= rot(b, 8); b += a; \

- a -= c; a ^= rot(c,16); c += b; \

- b -= a; b ^= rot(a,19); a += c; \

- c -= b; c ^= rot(b, 4); b += a; \

-/*

--------------------------------------------------------------------------------

-final -- final mixing of 3 32-bit values (a,b,c) into c

-Pairs of (a,b,c) values differing in only a few bits will usually

-produce values of c that look totally different. This was tested for

-* pairs that differed by one bit, by two bits, in any combination

- of top bits of (a,b,c), or in any combination of bottom bits of

- (a,b,c).

-* "differ" is defined as +, -, ^, or ~^. For + and -, I transformed

- the output delta to a Gray code (a^(a>>1)) so a string of 1's (as

- is commonly produced by subtraction) look like a single 1-bit

- difference.

-* the base values were pseudorandom, all zero but one bit set, or

- all zero plus a counter that starts at zero.

-These constants passed:

- 14 11 25 16 4 14 24

- 12 14 25 16 4 14 24

-and these came close:

- 4 8 15 26 3 22 24

- 10 8 15 26 3 22 24

- 11 8 15 26 3 22 24

--------------------------------------------------------------------------------

-*/

-#define final(a,b,c) \

-{ \

- c ^= b; c -= rot(b,14); \

- a ^= c; a -= rot(c,11); \

- b ^= a; b -= rot(a,25); \

- c ^= b; c -= rot(b,16); \

- a ^= c; a -= rot(c,4); \

- b ^= a; b -= rot(a,14); \

- c ^= b; c -= rot(b,24); \

-namespace re2 {

-/*

---------------------------------------------------------------------

- This works on all machines. To be useful, it requires

- -- that the key be an array of uint32_t's, and

- -- that the length be the number of uint32_t's in the key

- The function hashword() is identical to hashlittle() on little-endian

- machines, and identical to hashbig() on big-endian machines,

- except that the length has to be measured in uint32_ts rather than in

- bytes. hashlittle() is more complicated than hashword() only because

- hashlittle() has to dance around fitting the key bytes into registers.

---------------------------------------------------------------------

-*/

-uint32 hashword(

-const uint32 *k, /* the key, an array of uint32_t values */

-size_t length, /* the length of the key, in uint32_ts */

-uint32 initval) /* the previous hash, or an arbitrary value */

- uint32_t a,b,c;

- /* Set up the internal state */

- a = b = c = 0xdeadbeef + (((uint32_t)length)<<2) + initval;

- /*------------------------------------------------- handle most of the key */

- while (length > 3)

- {

- a += k[0];

- b += k[1];

- c += k[2];

- mix(a,b,c);

- length -= 3;

- k += 3;

- }

- /*------------------------------------------- handle the last 3 uint32_t's */

- switch(length) /* all the case statements fall through */

- {

- case 3 : c+=k[2];

- case 2 : b+=k[1];

- case 1 : a+=k[0];

- final(a,b,c);

- case 0: /* case 0: nothing left to add */

- break;

- }

- /*------------------------------------------------------ report the result */

- return c;

-/*

---------------------------------------------------------------------

-hashword2() -- same as hashword(), but take two seeds and return two

-32-bit values. pc and pb must both be nonnull, and *pc and *pb must

-both be initialized with seeds. If you pass in (*pb)==0, the output

-(*pc) will be the same as the return value from hashword().

---------------------------------------------------------------------

-*/

-void hashword2 (

-const uint32 *k, /* the key, an array of uint32_t values */

-size_t length, /* the length of the key, in uint32_ts */

-uint32 *pc, /* IN: seed OUT: primary hash value */

-uint32 *pb) /* IN: more seed OUT: secondary hash value */

- uint32_t a,b,c;

- /* Set up the internal state */

- a = b = c = 0xdeadbeef + ((uint32_t)(length<<2)) + *pc;

- c += *pb;

- /*------------------------------------------------- handle most of the key */

- while (length > 3)

- {

- a += k[0];

- b += k[1];

- c += k[2];

- mix(a,b,c);

- length -= 3;

- k += 3;

- }

- /*------------------------------------------- handle the last 3 uint32_t's */

- switch(length) /* all the case statements fall through */

- {

- case 3 : c+=k[2];

- case 2 : b+=k[1];

- case 1 : a+=k[0];

- final(a,b,c);

- case 0: /* case 0: nothing left to add */

- break;

- }

- /*------------------------------------------------------ report the result */

- *pc=c; *pb=b;

-} // namespace re2

« no previous file with comments | « third_party/re2/util/flags.h ('k') | third_party/re2/util/logging.h » ('j') | no next file with comments »