Buckets
Archive - Originally posted on "The Horse's Mouth" - 2010-12-26 08:22:09 - Graham Ellis
Programs need to dynamically allocate memory to store data as they run - for as a program is written, the programmer won't know how much data may be used in due course through the program at any one time. So the variable storage memory is usually arranged into a symbol table which holds the names of the variables currently in use, and an address where the values associated with may be found, and a heap which is where the actual values are held. The heap can grow in size as the program runs, so that there's no need in modern languages to code with limits and fixed sizes, and heap memory can also be released (when a variable isn't needed any more or has shrunk in size) for re-use by other variables. [Due to operating system constraints, the heap can't usually shrink].
If all this clever heap 'stuff' was done by allocating memory on a byte by byte basis, more space and time would be taken up administering the memory use than doing the actual work, so the allocation is typically done using larger units - and each of these larger units is called a bucket
You rarely see bucket allocation details, even as a programmer - it's too low level. But one of the places that you can see it is within Perl's hashes - if you refer to a hash in a scalar context, you'll get a result like
3/8
which means that the hash has 8 buckets allocated to it, and of those three are currently occupied with data. It's also interesting to watch as a perl hash grows, and see the hashtable being resizes as it gets almost full, with the number of buckets allocated being doubled and redoubled ...
for ($k=1; $k<4000; $k++) {
$fing{$k} = $k + 55;
print ("$k - ",($n = %fing),"\n");
}
gives the following results:
1 - 1/8
2 - 2/8
3 - 3/8
4 - 3/8
5 - 4/8
6 - 5/8
7 - 6/8
8 - 7/16
9 - 8/16
10 - 9/16
11 - 10/16
12 - 10/16
13 - 10/16
14 - 11/16
15 - 11/16
16 - 14/32
17 - 14/32
18 - 15/32
Finally, it's interesting to note a considerable increase in the number of buckets occupied each time the size of the hash is increased, indicative of the use of a longer hash key as the size of the hash table increases.
This code - in a complete example - is available [here]. The image is adapted from Free Clipart Now - "a large collection of high quality, public domain clipart graphics for presentations, web pages, documents, emails...".