
Why would you want to use a Perl hash?

Archive - Originally posted on "The Horse's Mouth" - 2011-09-20 14:18:17 - Graham Ellis

"What would I use one of THOSE for?" - a question often asked by newcomers to Perl when I first introduce hashes. Well - in writing a script to count up how many articles I've written about each subject we teach on, I came up with a perfect example.

A hash is described as an "unordered collection" variable. That means that under one name, you can hold a whole lot of values just as you can in a list or an array - but you can key the elements by name not by position number. Consider the TWO lists:

1  England
2  Scotland
3  Wales
4  Northern Ireland
5  Ireland


and


1  London
2  Edinburgh
3  Cardiff
4  Belfast
5  Dublin


Using those two lists, I could find you the capital of a country by searching through the first list, and then looking up the value at that position number in the second list. But wouldn't it be easier if I could just use a single table?


England            London
Scotland           Edinburgh
Wales              Cardiff
Northern Ireland   Belfast
Ireland            Dublin


And that's exactly what a hash lets you do. No need to search. No need to have (artificial) position numberings. No need to be VERY careful to keep the two lists synchronised as you edit them.
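As a minimal sketch of that single table in Perl (the names %capital, $country and $place are just ones picked for this illustration, not taken from the script described below):

```perl
use strict;
use warnings;

# The single table from above, held as a hash: country => capital
my %capital = (
    "England"          => "London",
    "Scotland"         => "Edinburgh",
    "Wales"            => "Cardiff",
    "Northern Ireland" => "Belfast",
    "Ireland"          => "Dublin",
);

# Look a value up directly by its key - no searching, no position numbers
my $country = "Wales";
print "The capital of $country is $capital{$country}\n";

# exists reports whether a key is present in the hash
for my $place ("Scotland", "France") {
    if (exists $capital{$place}) {
        print "$place: $capital{$place}\n";
    } else {
        print "$place: not in the table\n";
    }
}
```

Adding or removing a country is a one-line change, and there is no second list to keep in step.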


Here's some code from the example I wrote this morning. Firstly, here's how I set up a hash, keyed by the modules, to count the number of times each is mentioned in my data file:

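  # Count how many times each module code appears in the data file
  open(FH, "<", "/home/wellho/include/bloglinks.txt")
    or die "Cannot open bloglinks.txt: $!";
  while (<FH>) {
    chomp;                # remove the line ending (safer than chop)
    ($ref,$mod) = split;  # each line: article reference, then module code
    $counter{$mod}++;
    }
  close FH;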
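The counting idiom can be tried without the data file. This is only a sketch: the records in @records are made up for the illustration, in the "reference, then module code" shape that the split above expects.

```perl
use strict;
use warnings;

# Made-up sample records: an article reference followed by a module code
my @records = ("3440 H108", "3441 A603", "3442 H108", "3443 H108");

my %counter;
for my $record (@records) {
    my ($ref, $mod) = split ' ', $record;
    $counter{$mod}++;    # a key that is not there yet starts from undef, which ++ treats as 0
}

for my $mod (sort keys %counter) {
    print "$mod appears $counter{$mod} time(s)\n";
}
```

There is no need to set each counter to zero first, and no need to know the module codes in advance; the hash grows as new codes turn up.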


Here's another example of setting up a hash, this time reading the name of each module from a file:


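  # Build a lookup table: module code => description
  open(FH, "<", "/home/wellho/public_html/resources/mtable.txt")
    or die "Cannot open mtable.txt: $!";
  while (<FH>) {
    chomp;                              # remove the line ending (safer than chop)
    ($ref,$about) = split(/\s+/,$_,2);  # limit of 2 keeps the description whole
    $lookup{$ref} = $about;
    }
  close FH;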
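The third parameter to split matters here, as the descriptions contain spaces. A small sketch of the difference, assuming a line made up in the style of the lookup file (a module code, some whitespace, then the description):

```perl
use strict;
use warnings;

# A made-up line in the style of the lookup file
my $line = "H108 Objects in PHP";

# With a limit of 2, everything after the first run of whitespace stays together
my ($ref, $about) = split(/\s+/, $line, 2);
print "ref is '$ref', about is '$about'\n";

# Without the limit, the description is broken into separate words
my @pieces = split(/\s+/, $line);
print scalar(@pieces), " pieces without the limit\n";
```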


And then I printed out the results:

  print "\nHow many times have I talked about THAT?\n";
  for (sort {$counter{$a} <=> $counter{$b}} keys(%counter)) {
    print "$_ - $counter{$_} - $lookup{$_}\n";
    }



Of course, I wanted the results in order. And the irony is that you cannot sort a hash, as it has no order of its own. But you can get a list of all of the keys, and then sort that list by the value each key refers to - and that's what I've done. The results:

  How many times have I talked about THAT?
  [snip]
  G102 - 34 - Things to do in Melksham
  H108 - 35 - Objects in PHP
  H115 - 35 - Designing PHP-Based Solutions: Best Practice
  M100 - 36 - Introduction to Well House Manor
  A603 - 38 - Further httpd Configuration
  H999 - 38 - Additional PHP Material
  A101 - 41 - Linux -An Introduction For Users
  M401 - 42 - Seeing how others do it
  G999 - 42 - Keynote
  H112 - 42 - Further Web Page and Network Handling
  Y105 - 43 - Functions, Modules and Packages
  Z511 - 43 - Public Transport - Road
  M300 - 43 - Behind the scenes
  [snip]
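The "sort the keys, not the hash" approach can be varied quite easily. As a sketch, here are a few of the counts from the output above, typed in by hand rather than read from the data file, listed highest first and with ties broken by module code (the @ordered array is a name picked for this illustration):

```perl
use strict;
use warnings;

# A few of the counts from the output above, entered by hand
my %counter = (
    G102 => 34,
    H108 => 35,
    A603 => 38,
    M401 => 42,
    G999 => 42,
);

# Sort the list of keys: highest count first, ties broken by module code
my @ordered = sort {
    $counter{$b} <=> $counter{$a} or $a cmp $b
} keys %counter;

for my $mod (@ordered) {
    print "$mod - $counter{$mod}\n";
}
```

Swapping $a and $b in the numeric comparison is all it takes to reverse the order, and the "or" only moves on to the string comparison when the two counts are equal.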


Isn't that great, short code? You can read more articles about hashes in Perl [here]. And the concept is so good you'll also find it used in Python's dictionaries [here], Ruby's hashes [here], Tcl's arrays [here], Lua's tables [here] and PHP's associative arrays [here]. In C, C++ and Java, you'll find the same structures available to you through libraries; in Java, they're the HashMap, HashSet and Hashtable within the java.util package - see [here].

The complete source code of the Perl examples is [here], and I've enclosed it in a CGI wrapper so that you can run it [here]. Well, actually that's to let me run it, as it's a useful reminder of what I haven't talked about in a while.

Want to learn more Perl? See our Perl Courses.

Pictures - Cardiff, Edinburgh, London, Dublin and Belfast - which is which?