today I embarked on a learning journey. I wanted to learn about hash function. so I started by writing a hashmap datastructure in C. then I read good ol' donald knuth so that I could understand some sort of hash function -- hopefully to understand something general, so that I could design hash functions.
I ended up learning about fibonacci hashing, where if you take just the fractional part of (n /times recip of golden ratio), you get a sequence of relatively uniformly distributed numbers in [0,1). the idea is that for every new element, it goes as the golden ratio split of the largest interval; a new interval is added with each element added. see Fig 37 within TAOCP vo3(2ed),section 6.4.
needless to say, very cool. did I journey as deep as understanding the mathematical proof for the theorem? nah, but I guess I was satisfied.
what was most cool is my implementation of the hash function. I ended up doing some fixed point math, which by the way, I only picked up very recently. It was wonderful to see this come together.
finally, I ended up needing a log2 function for integers. so I reused one that I discovered sometime ago now -- it's from the stanford bithacks page. you basically use a 64bit floating point number and exploit the bitshift that occurs during normalization to extract log2 value directly from the float exponent ... crazy stuff, I know.
I guess for fun and as a trust building exercise, here's the hash function:
#define HM_HASH_DEBUG 1
static size_t _hm_hash(hashmap_t hm, uint32_t K)
{
float alpha = (sqrt(5.0)-1.0)/2.0;
uint32_t w = UINT32_MAX;
uint32_t A = (uint32_t)(alpha * ((uint64_t)1<<32));//cvt to fixed point.
uint64_t hofK = (((uint64_t)A*K))%w;//take just the fractional part.
#if HM_HASH_DEBUG
printf("hofK=%f given %u\n",
(float)hofK * 2.3283064365386963e-10,
K);
#endif
int m=log2_with_float64(hm.capacity);
#if HM_HASH_DEBUG
printf("m:%d\n", m);
#endif
hofK<<=m;
hofK>>=32;
#if HM_HASH_DEBUG
printf("hm.capacity:%d,hofK finale=%u\n",
hm.capacity, hofK);
#endif
return hofK;
}
#undef HM_HASH_DEBUG