Looking up a value by key - associative arrays / Hashes / Dictionaries
Archive - Originally posted on "The Horse's Mouth" - 2010-08-11 06:31:28 - Graham Ellis
In any substantial programming task, you'll want to store a whole series of values in some sort of table, so that you can process / reprocess / handle each of the elements in turn - and to do that, you'll use a variable construct that's called an "array", or a "list", a "tuple" or a "vector". (The exact word depends on the particular language you're using ... and how it's been implemented internally).
Arrays, lists, tuples and vectors are what we call "ordered collections" - in other words, they're bundles of variables which are keyed / looked up by position number. The numbering usually starts at 0 (exceptions - Lua and Fortran, which start at 1 by default), and in many cases that's ideal - but not in all cases.
Imagine that I have a data set which shows the UK ports from which you can take a ferry over the English Channel, and their destinations - [here]. The sort of question I'll be asking is "where can I go from Portsmouth?" and if I keep the data in an array or list - keyed by position number - I'll have to do a search of some sort every time I make an inquiry; chances are that I'll need two arrays (one of ports, one of destinations) which have to be kept in step. Ironically, my user really won't care if Portsmouth is at position 0, 1, 2, or 3 ... nor if Newhaven comes first ... There has to be a better way!
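To make the problem concrete, here is a minimal Python sketch of the "two arrays kept in step" approach. The port and destination lists are illustrative sample data made up for this example (they are not the contents of the channel file), and the function name `destinations_from` is hypothetical.

```python
# Illustrative sample data only - NOT the contents of the channel data file.
# The two lists must be kept in step: position n in one matches position n in the other.
ports = ["Dover", "Newhaven", "Portsmouth"]
destinations = [
    ["Calais", "Dunkirk"],
    ["Dieppe"],
    ["Caen", "Cherbourg", "St Malo"],
]

def destinations_from(port):
    """Search the ports list, then use the matching position in the second list."""
    for position, name in enumerate(ports):
        if name == port:
            return destinations[position]
    return []  # port not in the table

print(destinations_from("Portsmouth"))
```

Every inquiry walks the list from the start, and adding or removing a port means editing both lists at the same position - get that wrong and every later answer is wrong too.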
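There is. It's called an "Associative Array" (PHP), a "Dictionary" (Python), a "Hash" (Perl), a "table" (Lua), a "Hashtable" or "HashMap" (Java) ... (Java's HashSet is a different animal - it holds keys only, with no values attached.) In hashes and dictionaries, the elements are not held in a sorted order (a Perl hash has no dependable order at all, and a Python dictionary simply remembers insertion order from version 3.7 onwards), but they can be looked up directly by key. There's a source code example (in Perl) [here] showing the setup and use of a hash for looking up our channel ports from our channel file - and we cover this in detail on our Learning to program in Perl course.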
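The linked example is in Perl; as a rough equivalent, here is a small Python dictionary sketch of the same idea. The data is illustrative sample data made up for this example rather than the real channel file, and the names `channel` and `lookup_destinations` are hypothetical.

```python
# Illustrative sample data only - NOT the contents of the channel data file.
# Each key is a UK port; each value is the list of destinations from that port.
channel = {
    "Dover": ["Calais", "Dunkirk"],
    "Newhaven": ["Dieppe"],
    "Portsmouth": ["Caen", "Cherbourg", "St Malo"],
}

def lookup_destinations(port):
    """Look the port up directly by key; no searching, no second list to keep in step."""
    return channel.get(port, [])  # empty list if the port is not a key

print(lookup_destinations("Portsmouth"))
print("Newhaven" in channel)  # membership test is also by key
```

There's no position number anywhere in that code, which is the point: the user asks about "Portsmouth", and "Portsmouth" is what we look up.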
In PHP, things are a bit different. The Associative array CAN be ordered (sorted) - which overcomes one of the limitations of hashes and dictionaries, at the expense of efficiency on enormous data tables. But since PHP is going to be used primarily for web page generation, where the time taken / data sent can't be huge, this loss of efficiency in exceptional cases is no big deal. What does this mean?
It means that in PHP, I can set up a table of data - key, value pairs. I can then sort that table, and output the results in order. And I can do it VERY EASILY! I wrote an example of this - using fictional data for the number of deer in various national parks in the UK - on yesterday's PHP course, and I've shared the source code [here]. The code is robust (it has to be if it's going to be run on a public web site!) and you can run it [here].
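The shared example is in PHP, where functions such as ksort and arsort reorder the associative array itself. As a sketch of the same "set up, sort, output" steps in Python, where you sort a list of (key, value) pairs taken from the dictionary instead, something like the following would do. The park names and deer counts are invented for illustration and are not the figures from the course example.

```python
# Fictional deer counts, invented for this sketch.
deer = {
    "New Forest": 420,
    "Exmoor": 310,
    "Cairngorms": 980,
    "Dartmoor": 150,
}

# Sort by key (park name, alphabetically).
by_park = sorted(deer.items())

# Sort by value (number of deer, largest first).
by_count = sorted(deer.items(), key=lambda pair: pair[1], reverse=True)

for park, count in by_count:
    print(f"{park}: {count}")
```

The dictionary `deer` is left untouched by both sorts; `by_park` and `by_count` are new lists of pairs, so you can output the same data in as many different orders as you need.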